Abstract
Introduction
Superior labrum anterior and posterior (SLAP) lesions of the glenoid labrum refer to the detachment of the superior labrum extending from anterior to posterior, potentially involving the attachment site of the long head of the biceps tendon. 1 Both the superior labrum and the biceps tendon anchor serve to enhance joint stability by acting as secondary stabilizers of the shoulder. SLAP lesions are considered a significant cause of shoulder pain and dysfunction. With the widespread adoption of arthroscopic techniques, an increasing number of SLAP lesions are being identified. Onyekwelu et al. 2 reported a 464% increase in arthroscopic SLAP repairs performed in hospitals in the northeastern United States between 2002 and 2010. Consequently, advancing diagnostic approaches for SLAP lesions has become essential for optimal patient management.
Physical examination and clinical presentations for SLAP lesions lack sufficient diagnostic accuracy and demonstrate considerable variability across studies.3–9 Although magnetic resonance imaging (MRI) of the shoulder is currently the primary auxiliary diagnostic tool for SLAP lesions, studies have demonstrated that conventional MRI exhibits suboptimal diagnostic accuracy for these lesions, with a sensitivity of 60% and specificity of 92.5%. This limitation may be attributed to the anatomical complexity of SLAP lesions.10–12 Magnetic resonance arthrography (MRA) of the shoulder joint employs contrast enhancement by injecting a contrast agent into the glenohumeral joint space. 13 Nevertheless, MRA has certain limitations, including its invasive nature, the risk of joint infection, and unsuitability for patients with contrast agent allergies. Therefore, MRA is typically reserved as a supplementary examination in cases where conventional MRI proves inconclusive, rather than being employed as a routine diagnostic measure.
In recent years, with the widespread application of machine learning and the rapid development of radiomics, researchers have begun to explore the application of radiomics in cartilage, ligaments, and tendons.14,15 Oeding et al. 16 developed an XGBoost-based machine learning model for subscapularis tear prediction using preoperative imaging, achieving an accuracy of 0.85. Fei et al. 17 established a radiomics-based model for rotator cuff tear diagnosis, achieving areas under the receiver operating characteristic curve (AUCs) of 0.989 and 0.979 in the training and validation cohorts, respectively. Both models achieved favorable diagnostic performance.
Radiomics enables high-throughput quantitative extraction of high-dimensional imaging features from routine medical images. This approach allows for a comprehensive quantitative analysis of lesion heterogeneity, offering promising prospects in non-invasive preoperative diagnosis. To enable early detection and accurate diagnosis of SLAP lesions through more precise and non-invasive intelligent diagnostic approaches, the present study aims to establish a machine learning model based on clinical characteristics and radiomic features, and to evaluate its efficacy in diagnosing SLAP lesions. This research endeavors to provide valuable insights for the development of intelligent auxiliary tools in SLAP injury diagnosis.
Method
Methods and tools
Participants in the study
This study retrospectively collected clinical information and imaging data from patients who underwent shoulder arthroscopic surgery at our hospital from January 1, 2019, to December 31, 2024. The inclusion criteria were: (1) first shoulder arthroscopic surgery performed at our hospital; (2) preoperative shoulder MRI examination conducted at our hospital, including the following sequences: axial, oblique sagittal, and oblique coronal proton density-weighted fat-suppressed imaging (PD FS) and oblique coronal T1-weighted imaging (T1WI); (3) arthroscopically confirmed SLAP Type I or Type II lesions based on the Snyder classification system. The exclusion criteria were: (1) incomplete clinical data; (2) missing MRI images, poor imaging quality, or incomplete MRI sequences; (3) previous history of shoulder surgery; (4) arthroscopically confirmed Bankart lesions, shoulder dislocations, or glenohumeral instability. Figure 1 shows the patient selection flowchart based on the inclusion and exclusion criteria.
Flow diagram of subject enrollment.
Collection of patient clinical information
We retrospectively collected clinical data from the hospital information system (HIS), including basic clinical information such as age, gender, history of shoulder injury, duration of illness, and imaging results from radiologists. A history of shoulder injury refers to any injury causing shoulder pain and/or limited mobility. The duration of illness was recorded in months. Two radiologists with experience in musculoskeletal imaging independently analyzed the shoulder MRI images, reaching a consensus after joint review in case of disagreement. They were blinded to any personal information or relevant clinical data of the patients during the assessment. All arthroscopic surgeries were performed by experienced surgeons from the orthopedics department of our hospital. Under arthroscopy, through posterior and anterior/anterolateral approaches, the glenohumeral joint, long head of the biceps tendon, and biceps-labral complex were carefully examined. The condition of the superior labrum was observed and an intraoperative diagnosis was made based on the arthroscopic findings. If a tear of the superior labrum, or detachment of the superior labrum from the glenoid surface at the biceps tendon anchor, was found during surgery, it was recorded as a SLAP lesion. In this study, only patients with SLAP lesions classified as types I to II according to the Snyder classification were included. Type I and Type II SLAP lesions were selected because they represent the most commonly encountered superior labral pathologies in clinical practice. Figure 2 shows images from intraoperative exploration. During the same procedure, the glenohumeral joint cavity and subacromial bursa were examined to assess the extent of rotator cuff pathology. The condition of the rotator cuff tendons was documented and categorized as intact, partial-thickness tear, or full-thickness tear.
Both isolated SLAP lesions and SLAP lesions with concomitant rotator cuff tears (partial- or full-thickness) were included, with rotator cuff tears analyzed as a clinical variable.
Arthroscopic examination of the superior labrum of the shoulder joint.
Research methodology
This retrospective study was approved by the hospital ethics committee and adhered to ethical standards. The primary research protocol encompassed the following steps: (1) acquisition of MRI images; (2) delineation of regions of interest; (3) extraction and screening of radiomics features; (4) construction and evaluation of radiomics models; (5) assessment of traditional manual diagnostic efficacy; and (6) development of an integrated diagnostic model combining radiomics features with clinical characteristics, and construction of a nomogram.
Magnetic resonance image collection and delineation of target regions
The magnetic resonance images utilized in this study were acquired using two 3.0 T MRI scanners (SIGNA Pioneer 3.0 T, GE Healthcare). All patients underwent non-contrast shoulder joint MRI examinations. For radiomics analysis, oblique coronal PD FS images were selected from each scan. The acquisition parameters were as follows: repetition time (TR) = 1800–2200 ms, echo time (TE) = 30–50 ms, slice thickness = 4 mm. All images were preserved with standard soft tissue settings and stored in Digital Imaging and Communications in Medicine (DICOM) format. ITK-SNAP software was employed to manually delineate the region of interest (ROI) of the superior labrum of the shoulder joint on the collected DICOM files. 18 The superior labrum was contoured layer by layer along its margins on oblique coronal PD FS images. For each patient, 3–5 images were delineated from anterior to posterior, as illustrated in Figure 3. The areas delineated on these 3–5 images were saved as a single ROI file for each patient. Complete superior labrum regions were delineated in a total of 149 magnetic resonance images.
Manually delineated region of interest on the superior labrum of the shoulder joint.
To evaluate the reproducibility and consistency of ROI delineation, intra-observer variability was assessed using the intraclass correlation coefficient (ICC). For intra-observer agreement evaluation, the same radiologist repeated the image segmentation in 30 patients who were randomly selected from the study cohort after an 8-week interval. ICC values were calculated based on a single-rater, absolute-agreement, two-way random-effects model to assess the stability and reproducibility of radiomic features. Only cases with ICC values greater than 0.75 for ROI segmentation were included in the subsequent radiomics feature extraction and analysis to ensure reliable and reproducible results.
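The ICC form named above (single rater, absolute agreement, two-way random effects, often written ICC(2,1)) can be computed from the mean squares of a two-way analysis of variance. The following is a minimal sketch rather than the authors' code; the function name `icc_2_1`, the data layout (one row per case, one column per segmentation session), and the sample values are assumptions made for illustration.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: sequence of rows, one row per subject and one column per
    rater or repeated session (hypothetical layout for this sketch).
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of raters or repeated sessions
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects (rows), sessions (columns), and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical values of one radiomic feature from two segmentation passes.
first_pass = [0.82, 0.75, 0.91, 0.64, 0.70]
second_pass = [0.80, 0.78, 0.89, 0.66, 0.73]
icc = icc_2_1(list(zip(first_pass, second_pass)))
passes_threshold = icc > 0.75  # 0.75 is the cutoff stated in the text
```

In this layout the function would be called once per radiomic feature, with the 30 re-segmented patients as rows.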
Extraction and screening of radiomics features
Radiomics features can be categorized into three distinct groups: morphological features, first-order features, and textural features. Morphological features delineate the three-dimensional shape characteristics of the contoured region. First-order features describe the first-order statistical distribution of voxel intensities within the delineated area. Textural features elucidate intensity patterns or spatial relationships among voxels within the contoured region. In this study, textural features were extracted using various methodologies, including the gray-level co-occurrence matrix (GLCM), gray-level run length matrix (GLRLM), gray-level size zone matrix (GLSZM), and neighborhood gray-tone difference matrix (NGTDM) approaches. 19 All radiomics features were extracted using an internal feature analysis program implemented in pyradiomics (https://pyradiomics.readthedocs.io). 20 The radiomics feature extraction process adhered to the Image Biomarker Standardization Initiative (IBSI) guidelines.
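As an illustration of what a first-order feature measures, the sketch below computes skewness and kurtosis from a list of voxel intensities using moment-based definitions. It is a simplified stand-in, not the pyradiomics implementation used in the study; the function name and the sample intensities are hypothetical, and kurtosis is taken here as the uncorrected fourth standardized moment (equal to 3 for a normal distribution).

```python
def skewness_and_kurtosis(intensities):
    """Return (skewness, kurtosis) of the voxel intensities inside an ROI."""
    n = len(intensities)
    mean = sum(intensities) / n
    # Central moments of order 2, 3, and 4.
    m2 = sum((x - mean) ** 2 for x in intensities) / n
    m3 = sum((x - mean) ** 3 for x in intensities) / n
    m4 = sum((x - mean) ** 4 for x in intensities) / n
    if m2 == 0:
        raise ValueError("constant region: skewness and kurtosis are undefined")
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # not reduced by 3
    return skewness, kurtosis


# Hypothetical intensities: a mostly dark region with one bright voxel.
skew, kurt = skewness_and_kurtosis([1, 1, 1, 1, 6])
```

A positive skewness, as in this example, indicates a tail of bright voxels, which is the kind of intensity asymmetry these features summarize.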
All radiomics features were treated as continuous variables. We first applied the Shapiro-Wilk test to determine whether each extracted feature followed a normal distribution. Between-group differences were then evaluated for feature screening, using the independent samples t-test for normally distributed features and the Mann-Whitney U test for non-normally distributed features. Only radiomics features with a two-sided
Establishment and evaluation of radiomics models
Following Lasso feature selection, we input the final features into machine learning models. This study employed three machine learning models: Support Vector Machine (SVM), Random Forest (RF), and Light Gradient Boosting Machine (LightGBM).23,24 These models were used to construct separate diagnostic models for SLAP lesions.
Radiomics features extracted from 149 shoulder MRI scans were used for model construction. A total of 104 imaging datasets were randomly assigned to the training set for building the SLAP lesion diagnostic models, while the remaining 45 were allocated to the test set for evaluating diagnostic performance.
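A random 104/45 partition of 149 cases can be sketched as follows. The seed, the function name, and the use of a plain shuffle are assumptions; the text does not state whether the split was stratified by SLAP status.

```python
import random


def split_cases(case_ids, n_train, seed=0):
    """Shuffle case identifiers reproducibly and cut them into two sets."""
    shuffled = list(case_ids)
    random.Random(seed).shuffle(shuffled)  # local generator, fixed seed
    return shuffled[:n_train], shuffled[n_train:]


# 149 cases in total, 104 for training and the remaining 45 for testing.
train_ids, test_ids = split_cases(range(149), n_train=104)
```

Fixing the seed makes the partition repeatable, so that every model is trained and evaluated on the same cases.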
The predictive performance of the models was assessed using Receiver Operating Characteristic (ROC) curves, with the AUC calculated. 25 We also evaluated the accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of each model. Sensitivity refers to the ability to correctly identify patients with the condition as determined by the gold standard, also known as the true positive rate. 26 Specificity is the ability to correctly identify those without the condition, also known as the true negative rate. 27 The PPV represents the probability of actually having the disease when the model predicts a positive result, while the NPV represents the probability of not having the disease when the model predicts a negative result. Additionally, Decision Curve Analysis (DCA) was conducted to assess the clinical utility of each model by calculating the net benefit across various threshold probabilities, comparing model performance against the strategies of treating all or no patients. 28
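Two of the quantities described above can be written in a few lines: the AUC, viewed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, and the net benefit used in DCA, which is TP/n minus FP/n weighted by the odds of the threshold probability. This is a minimal sketch with hypothetical function names and toy data, not the software used in the study.

```python
def auc_from_scores(y_true, y_score):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    # A tie between a positive and a negative score counts as half a win.
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, y_score, threshold):
    """Net benefit of calling a case positive when its score >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy labels (1 = lesion present) and predicted probabilities, for illustration only.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
auc = auc_from_scores(labels, scores)
nb_model = net_benefit(labels, scores, threshold=0.5)
nb_all = net_benefit_treat_all(labels, threshold=0.5)
```

The "treat none" strategy always has a net benefit of zero, so a model is useful at a given threshold only if its net benefit exceeds both zero and the treat-all value.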
Evaluation of traditional manual diagnosis
Two radiologists with experience in musculoskeletal imaging independently analyzed the shoulder MRI images and determined the presence or absence of SLAP lesions. Disagreements were resolved through consensus after joint review. Data analysis was performed using
Nomogram of the combined model
Univariable logistic regression analysis of clinical variables associated with SLAP lesion.
Multivariable logistic regression analysis of clinical variables associated with SLAP lesion.
Result
Data results
This study retrospectively collected clinical data from patients who underwent MRI scans and shoulder arthroscopic surgery at our hospital between January 1, 2019, and December 31, 2024. A total of 268 patients were initially collected. After applying inclusion and exclusion criteria, 149 patients were ultimately included in the study. The diagnosis of SLAP lesions was based on arthroscopic findings during shoulder arthroscopic surgery. Among the included patients, 66 cases were diagnosed with SLAP lesions, while 83 cases were without SLAP lesions.
Demographic and clinical characteristics of patients in training and test sets.
Feature statistics
Radiomic features.

Distribution of all radiomics features and their
After the initial filtering process based on P-values, 22 features were identified. This number was subsequently reduced to 19 features based on Pearson correlation analysis. LASSO regression finally reduced the number of features to 9, which were used for further analysis. For the LASSO feature selection, we conducted 10-fold cross-validation to determine the optimal penalty coefficient λ. The LASSO cross-validation curve and regression coefficient path are illustrated in Figure 5. As shown in the figure, the optimal λ value was 0.0222. The features with non-zero coefficients at this λ value were retained for subsequent regression model fitting.
Parameter selection and feature selection for the Lasso regression model.
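How the penalty coefficient λ drives some coefficients to exactly zero, so that only features with non-zero coefficients are retained, can be illustrated with a small coordinate-descent LASSO. This is a sketch under stated assumptions: a squared-error objective of the form (1/2n)·||y − Xb||² + λ·||b||₁ with no intercept, centered toy data, and hypothetical function names. It does not reproduce the 10-fold cross-validation or the software used in the study.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; return 0.0 inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Minimize (1/2n)*||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0
            for i in range(n):
                # Prediction from every feature except feature j.
                partial = sum(X[i][m] * coef[m] for m in range(p) if m != j)
                rho += X[i][j] * (y[i] - partial)
            rho /= n
            scale = sum(X[i][j] ** 2 for i in range(n)) / n
            coef[j] = soft_threshold(rho, lam) / scale
    return coef


# Toy data: feature 0 tracks the outcome closely, feature 1 only weakly.
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [2.1, 1.9, -1.9, -2.1]
strong_penalty = lasso_coordinate_descent(X, y, lam=0.5)
weak_penalty = lasso_coordinate_descent(X, y, lam=0.05)
selected = [j for j, b in enumerate(strong_penalty) if b != 0.0]
```

With the stronger penalty only the first toy feature keeps a non-zero coefficient, while the weaker penalty retains both, which mirrors the role of λ in deciding how many features survive.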
The selected features comprised two first-order features (original_firstorder_Kurtosis and original_firstorder_Skewness), four texture features (the gray-level co-occurrence matrix features original_glcm_ClusterShade and original_glcm_MCC, and the gray-level size zone matrix features original_glszm_LargeAreaHighGrayLevelEmphasis and original_glszm_SmallAreaEmphasis), and three shape features (original_shape_Flatness, original_shape_Maximum2DDiameterColumn, and original_shape_Sphericity). Figure 6 illustrates the nine selected radiomics features and their corresponding coefficients in the Lasso model.
Selected radiomics features and their coefficients.
Machine learning models based on radiomics features
The diagnostic performance of the three radiomics models.

ROC curves of the three machine learning models evaluated on the training and testing datasets.
In the test set, the SVM model showed the highest specificity among the three models at 0.792, and the LightGBM model exhibited the highest sensitivity at 0.952. Based on the test set AUC, the LightGBM model demonstrated the best performance, achieving the highest AUC value of 0.867 in the test cohort. Notably, the test set AUC (0.867) for the LightGBM model was only slightly lower than its training set AUC (0.896), indicating good generalization without significant overfitting.
DCA on the independent test cohort (Figure 8) validated the clinical utility observed in the training cohort (Figure 9). All three models (SVM, RF, and LightGBM) demonstrated positive net benefit compared to treating all or no patients across clinically relevant threshold probabilities.
DCA curves of the three machine learning-based models in the test sets.
DCA curves of the three machine learning-based models in the training sets.

In the test cohort, the RF model maintained its superior performance, showing the highest and most stable net benefit across the majority of threshold probabilities ranging from 0.1 to 0.8. The LightGBM model exhibited comparable performance to RF, particularly in the moderate threshold probability range (0.2–0.6), while demonstrating slightly lower net benefit at higher thresholds. The SVM model, although showing positive net benefit, displayed more limited clinical utility with a narrower range of beneficial threshold probabilities compared to the other two models.
Diagnostic value of traditional manual diagnosis
Two radiologists with experience in musculoskeletal imaging independently analyzed the shoulder MRI images. Among 149 patients, there were 23 true positives, 52 true negatives, 14 false positives, and 60 false negatives. The accuracy of manual diagnosis of SLAP lesions on non-contrast MRI was 50.3%, with a diagnostic sensitivity of 27.7%, specificity of 78.8%, and AUC of 0.619. Traditional manual interpretation of non-contrast MRI therefore has limited effectiveness for diagnosing SLAP lesions.
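The reported accuracy, sensitivity, and specificity follow directly from the four counts given above. The short sketch below reproduces that arithmetic; the function name is hypothetical, and the PPV and NPV entries are derived from the same counts using the definitions in the Methods rather than taken from the text.

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Standard diagnostic accuracy measures from a 2x2 confusion matrix."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts reported for manual reading of non-contrast MRI.
manual = diagnostic_metrics(tp=23, tn=52, fp=14, fn=60)
print(round(manual["accuracy"], 3))     # 0.503
print(round(manual["sensitivity"], 3))  # 0.277
print(round(manual["specificity"], 3))  # 0.788
```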
Nomogram
The integrated diagnostic model, which combines predictions from the LightGBM model with clinical characteristics, is visualized as a nomogram in Figure 10. In the training cohort, the combined model achieved an AUC of 0.911. When applied to the test cohort, the model maintained robust performance with an AUC of 0.899. In the training set, the model demonstrated a specificity of 83.1% and sensitivity of 84.4%, while in the test set it exhibited a specificity of 91.7% and sensitivity of 76.2%.
Nomogram of the combined model.
Discussion
The mechanism of SLAP lesions is quite complex, and current research still lacks a comprehensive understanding. At present, there is a lack of effective methods for diagnosing SLAP lesions in clinical practice.30–34 This study developed a diagnostic model for SLAP lesions based on MRI radiomics. By analyzing nine radiomic features highly correlated with SLAP lesions, the diagnostic model we constructed achieved an AUC of 0.867, sensitivity of 0.952, and specificity of 0.625 on the test set. Furthermore, when comparing the diagnostic performance of this model with traditional manual diagnostic methods, the results showed that the machine learning model outperformed manual diagnosis. Additionally, we developed a combined model incorporating both clinical and radiomic features, presented as a visual nomogram. This model exhibited strong diagnostic ability on the test set evaluation, with an AUC of 0.899, sensitivity of 0.762, and specificity of 0.917. This further confirms the excellent diagnostic capability of the radiomics model for SLAP lesions.
Imaging examinations play an irreplaceable role in the preoperative identification and evaluation of SLAP lesions. However, at present, radiologists can only provide qualitative information and semi-quantitative data from medical images, without the ability to further quantify image content. This limitation results in a vast amount of imaging data remaining underutilized.35,36 Radiomics, by contrast, treats images as mineable quantitative data, capable of extracting information that may not be discernible to the naked eye—essentially, “images are more than pictures, they are data”.37,38 The extracted radiomic features quantitatively describe aspects such as intensity distribution, spatial relationships, textural heterogeneity, lesion morphology, and the interactions between lesions and surrounding tissues. 39 These features can then be linked to clinically relevant outcomes, paving new avenues for the application of artificial intelligence in precision diagnosis. In recent years, radiomics has shown tremendous potential in the diagnosis and prognosis of orthopedic diseases.22,40 Varriano et al. 41 reported that their radiomics-based model demonstrated superior diagnostic performance for adhesive capsulitis detection, with the automated method achieving an accuracy of 0.7455 versus 0.5409 for beginner-level radiology residents.
In the study by Fei et al., 17 a machine learning model for detecting rotator cuff tear was developed and validated based on radiomic features from the supraspinatus region. Furthermore, a predictive model for rotator cuff re-tear was developed by integrating radiomic features extracted from three distinct anatomical regions: the supraspinatus, infraspinatus, and humeral head. The predictive performance of this radiomic model in forecasting postoperative re-tear was comprehensively assessed. The findings confirmed that radiomics-based models achieved strong performance in both diagnosing rotator cuff tear and predicting postoperative re-tear, thereby opening new possibilities for applying radiomics to the diagnosis of other soft tissue disorders of the shoulder.
In the present study, our radiomics model for diagnosing SLAP lesions outperformed conventional manual diagnostic methods. Both the AUC and accuracy of the model exceeded those of manual diagnosis, underscoring the advantage of radiomics in diagnostic precision. By quantitatively analyzing lesion heterogeneity, radiomics offers a more accurate means of characterizing pathological changes compared with traditional visual interpretation. Conventional manual diagnosis is constrained by the limited capacity of human visual perception to identify subtle imaging characteristics, particularly inconspicuous radiological features, which consequently reduces diagnostic precision. Radiomics, by quantifying imaging features and applying standardized evaluation criteria, enables the more sensitive and precise detection of such subtle differences.
In a study conducted by Kibler et al., 42 the clinical experience of surgeons in diagnosing SLAP lesions was investigated. The results showed that 57% of surveyed surgeons regarded arthroscopy as the most accurate diagnostic method, while 36% considered clinical history to be the most important factor, followed by imaging findings and physical examination. This underscores the significance of clinical history in the diagnostic process.
The results of this study demonstrated that the combined radiomics–clinical model outperformed the radiomics model. This may be explained by the fact that the MRI-based radiomics model exclusively captured pathological and anatomical information of the superior labrum, without incorporating other clinically relevant dimensions. Since the development of SLAP lesions is a dynamic process, integrating clinical indicators provides a more comprehensive reflection of the disease profile; for example, patient age reflects the degree of labral degeneration. Radiological interpretation by attending radiologists or surgeons represents a pivotal diagnostic reference in the clinical assessment of SLAP lesions. In this investigation, radiological diagnostic determinations were incorporated as a fundamental clinical variable for model construction. The inclusion of such multidimensional information substantially enhances the predictive capacity of radiomics models. At present, there is no consensus regarding the clinical presentation and imaging standards for diagnosing SLAP lesions. Therefore, in clinical practice, diagnostic decisions should be made by synthesizing multiple sources of evidence rather than relying on a single modality.
This study has several limitations that should be acknowledged. First, the retrospective single-center design and relatively small, unevenly distributed sample size may have introduced bias and increased the risk of model overfitting. To enhance generalizability and stability, future work should involve larger cohorts and multi-center external validation. Second, image segmentation in this study relied on manual delineation of regions of interest by experienced radiologists. Although this ensured expert anatomical input, manual segmentation is inherently time-consuming and operator dependent, which limits practicality for real-world use and may reduce reproducibility. Moreover, the present study does not provide an automated or clinically feasible workflow for implementation. Third, the clinical features included in this study did not encompass specialized physical examination maneuvers specifically designed for the assessment of SLAP lesions. The absence of these diagnostic indicators may have limited the predictive performance of the clinical model. Incorporating targeted physical examination tests in future studies could strengthen diagnostic accuracy and enhance clinical applicability. Although our models demonstrated improved diagnostic performance compared with routine interpretation of non-contrast MRI by radiologists, clinical management of suspected SLAP lesions is multifactorial and should not rely on imaging findings alone. In contemporary practice, treatment decisions are primarily guided by patient age, symptom profile, activity demands, concomitant shoulder pathology, and response to nonoperative management. Finally, although DCA suggests clinical utility across relevant risk thresholds, this study is designed solely as an adjunctive decision-support tool to reduce diagnostic uncertainty and does not demonstrate how improved diagnostic accuracy would alter surgical indications or reduce unnecessary arthroscopies. 
Subsequent investigations should incorporate validated, shoulder-specific patient-reported outcome measures to determine whether pathways guided by the model achieve clinically meaningful improvements, interpreted using established minimal clinically important difference (MCID) or Patient Acceptable Symptom State (PASS) thresholds.
Conclusion
In summary, this study developed and validated machine learning models based on both clinical characteristics and radiomics features for the diagnosis of SLAP lesions. The radiomics model, constructed from shoulder MRI-derived features, demonstrated effective identification of SLAP lesions and achieved higher diagnostic accuracy compared with conventional manual assessment. Moreover, the combined model integrating radiomics and clinical features further enhanced predictive performance for SLAP lesion risk and was visualized through a nomogram, offering a practical tool for clinical application.
