Abstract
Introduction
Neuropathic pain (NP) represents one of the most debilitating and under-recognized complications among cancer patients, significantly impairing quality of life and therapeutic adherence. It can arise from direct tumor invasion of neural structures or as a consequence of oncological treatments such as surgery, chemotherapy, or radiotherapy. 1 Chemotherapy-induced peripheral neuropathy (CIPN), post-surgical neural trauma, and treatment-related neurotoxicity are among the most prevalent etiologies, often persisting long after cessation of treatment and contributing to chronic suffering and functional impairment.2,3 The clinical burden is compounded by the subjective and multifactorial nature of NP, which includes sensory abnormalities, emotional distress, and altered pain perception, making its early detection and prediction extremely challenging using traditional assessment methods.
Timely prediction of NP in oncology is imperative to minimize irreversible neural damage, personalize analgesic regimens, and enhance rehabilitation strategies. Despite clinical awareness of NP, current assessment tools are largely reactive rather than predictive, often relying on patient-reported outcomes such as the Brief Pain Inventory (BPI), Numerical Rating Scale (NRS), Visual Analog Scale (VAS), and Douleur Neuropathique 4 (DN4).4,5 These instruments, while useful, are inherently limited by subjectivity, inter-individual variability, and lack of integration with multimodal biological or behavioral data. As a result, clinicians are often unable to preemptively identify high-risk individuals before the manifestation of severe or chronic symptoms, thereby missing a critical window for intervention.
Amid these challenges, artificial intelligence (AI) and machine learning (ML) have emerged as transformative tools in oncology and pain medicine, offering data-driven solutions capable of identifying complex, non-linear associations among heterogeneous clinical, behavioral, imaging, and molecular features. 6 These technologies have shown growing promise in prognostication, diagnostic classification, and therapeutic optimization across medical disciplines, particularly in cancer research, where precision medicine paradigms necessitate the integration of high-dimensional, real-time patient data.7,8 In the context of NP, AI can enhance early prediction by combining diverse modalities such as lipidomics, 3 radiomics, 9 psychometric profiling, 10 and facial and speech emotion recognition, 11 thereby improving predictive accuracy and informing tailored pain management strategies. AI and ML are also being applied in other areas of healthcare, for example in maternal and delivery care, where they have been used to predict postpartum hemorrhage 12 and preeclampsia.13,14
Despite the increasing publication of AI-based models aimed at predicting cancer-related NP, a significant gap persists in translating these technologies to real-world clinical practice. Several reviews have highlighted a lack of methodological rigor, limited external validation, and poor calibration of predictive models.1,15 Moreover, the generalizability of these models is constrained by heterogeneous datasets, underrepresentation of diverse populations, and a lack of standardized outcome measures. Furthermore, interpretability and explainability remain key limitations of AI-driven predictions, which restrict clinician trust and regulatory acceptance, especially when deep learning (DL) architectures are used without transparent feature attribution.2,16 Therefore, a comprehensive synthesis of current evidence is necessary to assess the landscape, evaluate the performance and clinical readiness of existing models, and identify future directions for the responsible and effective integration of AI in NP prediction.
This systematic review aims to critically examine and consolidate empirical research on applying ML and AI to predict NP and related outcomes in cancer patients. This review seeks to elucidate key trends, highlight high-performing methodologies, and uncover common predictors and limitations in the field by synthesizing findings across studies involving diverse data types and model architectures. In doing so, we aim to provide clinicians, researchers, and digital health innovators with a detailed, evidence-based roadmap for enhancing pain prediction in oncology, thereby contributing to the broader movement toward personalized and precision pain care.
Methods
Eligibility criteria
Studies were selected based on predefined eligibility criteria grounded in the Population, Intervention, Comparator, Outcome, Study Design (PICOS) framework to ensure methodological rigor and relevance to the research objective. Eligible studies included original empirical investigations that applied ML or AI techniques to predict, classify, or diagnose NP specifically in oncology populations. The target population encompassed individuals with any type of cancer, irrespective of age, sex, tumor site, or treatment modality (e.g., surgery, chemotherapy, radiation). Interventions of interest were computational models employing supervised, unsupervised, or DL algorithms utilizing clinical, imaging, behavioral, or molecular data to forecast NP incidence, severity, or progression. Studies were required to report measurable outcomes such as model performance metrics (e.g., area under the receiver operating characteristic curve (AUC), sensitivity, specificity, accuracy), pain classification scores, or predictive biomarkers associated with neuropathic symptomatology. Exclusion criteria encompassed studies focused on non-cancer populations, general or nociceptive pain unrelated to neuropathic mechanisms, and those lacking a computational predictive component. Reviews, commentaries, editorials, animal studies, and case reports were excluded unless narrative reviews were specifically included to contextualize technological advancements or highlight gaps in empirical evidence. Only articles published in English and involving human participants were considered for inclusion to maintain consistency in reporting standards and clinical applicability.
Information source
The search targeted empirical studies with defined samples, measurable outcomes, and validated instruments, using multiple academic databases including PubMed, EMBASE, Web of Science, IEEE Xplore, and Google Scholar. This systematic review was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 guidelines to ensure transparency, reproducibility, and methodological rigor throughout the research process. The review protocol was prospectively registered in the International Prospective Register of Systematic Reviews (PROSPERO), hosted by the National Institute for Health Research (NIHR), to avoid duplication and enhance credibility. The registration was completed before the formal literature search, with the registration number: [CRD42025642950]. The protocol specifies the review objectives, inclusion and exclusion criteria, data sources, and analytical tools to guarantee that the study selection and synthesis follow a planned, systematic methodology.
Search strategy
A comprehensive search strategy was undertaken to identify peer-reviewed, English-language studies published between January 2020 and February 2025 that applied AI or ML to predict or assess NP in cancer patients. The last database search was conducted on 14 February 2025. Search terms included combinations of keywords and Boolean operators such as (“neuropathic pain” OR “chronic postsurgical pain” OR “chemotherapy-induced peripheral neuropathy” OR “CIPN”) AND (“cancer” OR “oncology” OR “tumor”) AND (“machine learning” OR “artificial intelligence” OR “deep learning” OR “predictive model” OR “radiomics”) AND (“diagnosis” OR “prediction” OR “risk assessment”). Inclusion criteria required studies to involve cancer-related NP (e.g., breast, colorectal, brain, ovarian, and hepatocellular carcinomas), utilize AI/ML methods for prediction or classification, include human participants with defined sample sizes (ranging from
Selection process
The study selection process followed the PRISMA 2020 guidelines to ensure transparency and reproducibility. A total of 723 records were identified through comprehensive database searching across PubMed, EMBASE, Web of Science, IEEE Xplore, and Google Scholar. After removing duplicates and title/abstract screening, 112 records were selected for further evaluation. Of these, 98 full-text articles were assessed for eligibility based on predefined inclusion and exclusion criteria. Subsequently, 84 articles were excluded for reasons such as lack of relevance to the intervention or outcome, absence of NP focus in cancer populations, and inadequate study design (e.g., editorials, commentary pieces, or reviews without empirical data). Ultimately, 14 studies met all eligibility criteria and were included in the final synthesis. The selection process was independently conducted by two reviewers to minimize bias, with discrepancies resolved through discussion or consultation with a third reviewer. The detailed study selection pathway is illustrated in Figure 1 (PRISMA flow diagram).

Figure 1. Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagram.
Data Collection, Extraction, and Synthesis Process
The data collection and extraction process were guided by a structured evidence matrix that systematically captured key variables from each of the 14 included studies, including author and year, study design, sample size and population, AI/ML techniques used, pain assessment instruments, and primary outcomes such as model performance metrics (e.g., AUC, RMSE, sensitivity, accuracy). This approach ensured consistency in evaluating the methodological rigor and predictive capabilities across diverse studies, including systematic reviews, narrative reviews, cohort studies, and experimental designs. The resulting matrix is presented in Appendix A.
Quality Assessment and Risk of Bias
The quality assessment of the 14 included studies was conducted using QUADAS-2 for reviews and narrative syntheses, and PROBAST for predictive modeling studies. (See the Table in Appendix B.) Overall, the majority of studies were judged to have low to moderate risk of bias. High-quality modeling studies such as those by Salama et al., 1 Sun et al., 5 and Wirries et al. 17 demonstrated robust methodology, adequate sample sizes, and sound model calibration. Conversely, narrative reviews (e.g., Qian et al., 2 Khalighi et al., 7 Naseri et al. 16) were rated moderate due to a lack of primary data or external validation. Only Cascella et al. 11 had a higher risk due to its exploratory nature and minimal sample size (
Results
Overview of included studies
A total of 14 studies published between 2006 and 2024 were included in this systematic review, selected based on their empirical or theoretical contributions to the application of AI and ML in predicting or assessing NP among cancer patients. The majority of the studies originated from high-income countries, particularly the United States, Germany, Canada, and China, reflecting the dominant role of technologically advanced healthcare systems in AI-driven oncology research. However, several important contributions also emerged from low- and middle-income countries (LMICs), such as India and Egypt, demonstrating growing global engagement in the use of AI for pain management in oncology despite infrastructural and data limitations.
In terms of study design, most empirical investigations employed prospective cohort methodologies, such as those by Lötsch et al., 3 Juwara et al., 4 and Wirries et al., 17 enabling robust data collection and predictive model training using real-world clinical variables and outcomes. A subset of studies, including the work of Sun et al., 5 leveraged high-quality randomized controlled trial (RCT) datasets for secondary analysis, thereby enhancing internal validity and minimizing confounding. In contrast, narrative and systematic reviews such as those by Salama et al., 1 Qian et al., 2 Khalighi et al., 7 and Naseri et al. 16 synthesized evidence from multiple sources to provide broad overviews of AI applications in oncology and NP, albeit with varying degrees of methodological rigor and primary data usage. Cascella et al. 11 presented an exploratory observational study with a very limited sample size (
Sample sizes varied considerably, from as few as two participants in small-scale feasibility studies to as many as 46,104 individuals included in the aggregated datasets of comprehensive systematic reviews like the work of Salama et al. 1 This wide variability reflects differences in the scope and generalizability of findings, with larger cohort studies offering more robust statistical power and external validity. The predominant clinical focus across the studies was breast cancer, likely due to the high prevalence of chronic postsurgical pain (CPSP) and chemotherapy-induced neuropathy in this population. Other cancer types explored included gynecologic (e.g., ovarian), colorectal, hepatocellular, and central nervous system malignancies. Notably, many studies used AI to address CIPN, CPSP, or pain related to tumor infiltration of neural structures, highlighting AI's versatility in predicting neuropathic outcomes across treatment stages and anatomical contexts.
AI and ML techniques applied
Supervised learning models
Supervised learning models were predominantly utilized across the included studies, with support vector machines (SVM) appearing in 11 studies and demonstrating strong performance, such as in the work of Guan et al., 18 which reported an AUC of 0.808. Random forest (RF) was featured in over 12 studies and emerged as the top-performing model, achieving AUCs up to 0.94 and a median AUC of 0.81, as highlighted by the work of Salama et al. 1 Logistic regression (LR), frequently used as a baseline comparator, showed more modest results, with AUCs ranging between 0.631 and 0.78. Ensemble methods like gradient boosting machines (GBMs) and XGBoost were employed in at least six studies; for instance, Sun et al. 5 reported AUCs of 0.731 (XGBoost) and 0.755 (GBM), with XGBoost also showing superior calibration (integrated calibration index, ICI = 0.050). Neural networks (NNs), including both shallow and deep architectures, were leveraged for capturing complex interactions, contributing to high classification accuracies, particularly in multimodal studies such as Cascella et al., 11 which achieved up to 95% F1-scores for emotion-based pain detection.
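To make the AUC values reported above concrete, the following is a minimal sketch of how an AUC can be computed from predicted risks, using the rank interpretation (the probability that a randomly chosen patient with NP receives a higher score than a randomly chosen patient without NP). The function name `roc_auc` and the six example patients are illustrative assumptions and are not taken from any of the included studies.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted risks for six patients (1 = developed NP, 0 = did not).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
example_auc = roc_auc(labels, scores)  # 8 of 9 pairs are ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the figures above should be read.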
Unsupervised learning models
Unsupervised learning models were applied less frequently among the reviewed studies, but played a valuable role in exploratory data analysis and pattern recognition. Sipilä et al. 10 utilized clustering methods such as k-means and hierarchical clustering to categorize breast cancer survivors into subgroups based on distinct pain trajectories, providing insights into psychological predictors of persistent NP. Similarly, Lötsch et al. 3 employed emergent self-organizing maps (ESOM) to differentiate lipidomic profiles in pre- and post-paclitaxel treatment samples, revealing meaningful biochemical shifts associated with CIPN. These unsupervised approaches facilitated the identification of latent structures in complex datasets, contributing to hypothesis generation and biomarker discovery in oncologic pain research.
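As an illustration of the k-means clustering mentioned above, the sketch below groups patients by a two-point pain trajectory. It is a plain implementation of Lloyd's algorithm with caller-supplied starting centroids; the feature choice (pain score at two time points), the data, and the function name `kmeans` are assumptions made for illustration and do not reproduce the analysis of any included study.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    assignments = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid (squared Euclidean distance).
        assignments = [
            min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[k])  # leave an empty cluster where it is
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return assignments, centroids


# Hypothetical trajectories: (pain score at 1 month, pain score at 12 months).
trajectories = [(7, 6), (8, 7), (6, 6), (5, 1), (4, 0), (6, 2)]
clusters, centers = kmeans(trajectories, [(8.0, 7.0), (4.0, 0.0)])
# Cluster 0 collects the persistently high scores, cluster 1 the resolving ones.
```

In practice the number of clusters and the starting centroids are themselves modeling choices, so results of this kind are usually treated as hypothesis-generating.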
Deep learning and hybrid models
DL and hybrid models were prominently used in studies involving imaging data and emotional pain assessment. Cascella et al. 11 implemented a multimodal framework combining computer vision and natural language processing, utilizing a multilayer perceptron (MLP) to analyze facial expressions and speech prosody for real-time pain detection, achieving up to 95% F1-scores for sadness recognition. In neuro-oncology, Khalighi et al. 7 employed advanced DL architectures like CNN, 3D U-Net, and DeepMedic to extract radiomic features from MRI scans for tumor segmentation and neuropathic risk assessment, reporting segmentation accuracy with Dice scores up to 0.9 and predictive AUCs exceeding 0.85. These DL models enabled the integration of high-dimensional data, improving both diagnostic precision and the emotional contextualization of pain in cancer patients.
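The Dice score used above to report segmentation accuracy measures the overlap between a predicted mask and a reference mask as twice the shared area divided by the sum of both areas. A minimal sketch follows; the tiny 2x3 masks and the function name `dice_score` are illustrative assumptions, whereas real segmentation masks are three-dimensional and far larger.

```python
def dice_score(pred_mask, true_mask):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks of equal shape."""
    pred = [v for row in pred_mask for v in row]
    true = [v for row in true_mask for v in row]
    overlap = sum(1 for p, t in zip(pred, true) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in true if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Hypothetical masks: 1 marks a pixel labeled as tumor.
predicted = [[0, 1, 1],
             [0, 1, 0]]
reference = [[0, 1, 1],
             [0, 0, 0]]
example_dice = dice_score(predicted, reference)  # 2 * 2 / (3 + 2)
```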
Input features and data modalities
Clinical and demographic variables
Clinical and demographic variables played a central role in many of the predictive models across the included studies, with a consistent emphasis on postoperative pain intensity, anxiety, and depression scores (commonly assessed via the Hospital Anxiety and Depression Scale, HADS), 19 surgical details, menopausal status, body mass index (BMI), prior surgeries, and anesthetic use. For instance, Sun et al. 5 found that postoperative NRS pain scores within 48 h, along with axillary lymph node dissection, were among the strongest predictors of CPSP, with ML models achieving AUCs of 0.749–0.755. Juwara et al. 4 also identified acute postoperative pain and anxiety as key features, showing a well-calibrated LR model (slope = 1.00; intercept = 0.03) with RMSE ranging from 1.16 to 1.50. Similarly, Wirries et al. 17 reported that HADS-anxiety scores and BMI were strong influencers in predicting pain trajectories post-lumbar disc herniation, with decision tree regression models yielding mean absolute errors of 1.79–1.97 for pain scores. Collectively, these variables underscore the multifactorial nature of NP and the importance of both physiological and psychosocial predictors in AI-based oncology pain models.
Omics and biomarkers
Omics-based predictors, including genomic, epigenetic, and proteomic features, were pivotal in enhancing the predictive power of AI models for NP in oncology.20,21 Studies cited in Salama et al. 1 employed ML algorithms such as SVM, RF, and Naïve Bayes to analyze gene polymorphisms and methylation patterns, achieving AUCs ranging from 0.65 to 0.83 in predicting chemotherapy-induced NP among breast cancer patients. Lötsch et al. (2024) 3 conducted a high-resolution lipidomics analysis using LC-MS/MS on pre- and post-treatment plasma samples from 31 breast cancer patients undergoing paclitaxel therapy. Using RF, SVM, and LR models, and supported by ESOM, they identified sphinganine-1-phosphate (SA1P) and sphingomyelins 33:1 and 43:1 as key lipid biomarkers distinguishing neuropathy-positive cases with up to 90% median balanced accuracy. In vitro validation showed SA1P activated TRPV1 and S1P receptors in 11.7% of sensory neurons, biologically confirming its role in neuropathic mechanisms. These findings support the integration of omics data with AI for biologically grounded, high-fidelity prediction of NP in cancer care.
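Balanced accuracy, the metric quoted above for the lipidomics classifiers, is the mean of sensitivity and specificity, which keeps a classifier from looking good simply by favoring the larger class. The sketch below shows the calculation on made-up labels; the function name `balanced_accuracy` and the data are illustrative assumptions.

```python
def balanced_accuracy(true_labels, predicted_labels):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical sample labels: 1 = neuropathy-positive, 0 = neuropathy-negative.
truth = [1, 1, 1, 1, 0, 0]
predictions = [1, 1, 1, 0, 0, 1]
example_bacc = balanced_accuracy(truth, predictions)  # (0.75 + 0.50) / 2
```

Here plain accuracy would be 4/6, roughly 0.67, while balanced accuracy is lower because the minority class is classified no better than chance.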
Imaging and radiomics
DL models utilizing imaging and radiomics were instrumental in predicting NP by extracting complex features from MRI, PET, and CT scans. Khalighi et al. 7 demonstrated that models such as 3D U-Net, DeepMedic, and V-Net achieved segmentation accuracies with Dice scores up to 0.90 when localizing brain tumors and predicting neuropathic risks based on markers like MGMT methylation and IDH mutation status. Similarly, Lin et al. 9 reviewed MRI-based AI applications in post-neoadjuvant breast cancer care, where radiomic models reached AUCs ranging from 0.77 to 0.99 and sensitivities up to 0.88 in predicting treatment outcomes and complications related to NP. These models highlight the value of radiomics in capturing anatomical and molecular patterns associated with
Behavioral and affective cues
Behavioral and affective cues were effectively leveraged by AI models to infer NP states, particularly through speech prosody and facial expression analysis. Cascella et al. 11 employed a multimodal AI framework combining speech emotion recognition and facial action units, achieving an overall accuracy of 84% and F1-scores as high as 95% for detecting sadness, one of the most indicative emotional states linked to pain. In patients with cancer-related NP, negative emotions such as anger (20%) and sadness (17%) were significantly more prevalent compared to non-pain cases, where neutral (56%) and surprise (41%) dominated, underscoring the role of emotional profiling in personalized pain assessment.
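The per-emotion F1-scores cited above combine precision and recall for a single class into their harmonic mean. A minimal sketch of that calculation is given below; the emotion labels, the six example predictions, and the function name `f1_for_class` are assumptions for illustration and do not reproduce the data of Cascella et al.

```python
def f1_for_class(true_labels, predicted_labels, target):
    """One-vs-rest F1 for a single class label."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical emotion labels for six recorded segments.
truth = ["sadness", "sadness", "anger", "neutral", "sadness", "neutral"]
predicted = ["sadness", "sadness", "sadness", "neutral", "anger", "neutral"]
sadness_f1 = f1_for_class(truth, predicted, "sadness")  # precision 2/3, recall 2/3
```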
Pain assessment instruments used
Across the 14 included studies, a variety of standardized pain assessment instruments were employed to evaluate NP and related outcomes, ensuring consistency and reliability in pain measurement. The NRS was the most frequently utilized, appearing in 10 studies, particularly in postoperative contexts to capture acute and chronic pain intensity, such as in Sun et al., 5 where NRS scores within 48 h post-surgery served as strong predictors of CPSP. The BPI, used in 7 studies, provided a multidimensional evaluation of pain severity and interference with daily activities and was instrumental in studies like Salama et al. 1 and Antel et al. 6 to benchmark AI model outputs. The VAS was also commonly applied in postoperative and spine-related pain contexts, as demonstrated in Wirries et al., 17 who used the VAS in conjunction with the Oswestry Disability Index (ODI) to train decision tree models for predicting long-term outcomes following lumbar disc herniation. The DN4 questionnaire, a validated tool for distinguishing neuropathic from nociceptive pain, was used in Juwara et al. 4 to train ML models in breast cancer patients post-surgery. Additionally, a modified version of the BPI was adopted by Sun et al. 5 to define CPSP outcomes at 12 months, enabling nuanced classification for ML training and evaluation. These tools played a critical role in standardizing data input for AI models, enhancing comparability and predictive accuracy across studies.
Model performance metrics
Discrimination (AUC, accuracy, RMSE)
Discrimination performance across the included studies was primarily assessed using the AUC, accuracy, and RMSE, highlighting the predictive strength of various AI models. RF models consistently demonstrated superior discriminatory power, with AUCs reaching as high as 0.94 in studies like Guan et al. 18 and a median AUC of 0.81 across studies reviewed by Salama et al. 1 SVMs showed robust performance with AUC values ranging between 0.808 and 0.87, while NNs achieved comparable results with AUCs around 0.83 to 0.86. LR, typically used as a baseline comparator, showed more modest discrimination, with AUCs ranging from 0.63 to 0.78. RMSE was used in regression-based models to evaluate prediction error, with Juwara et al. 4 reporting values between 1.16 and 1.50 for post-surgical pain prediction, and Kumar et al. 22 achieving lower RMSEs of 0.127–0.132 for fentanyl dosage prediction, indicating high precision in postoperative pain outcome forecasting.
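For the regression-based models above, RMSE and the related mean absolute error (MAE) summarize how far predicted scores fall from observed ones, in the units of the outcome itself. The sketch below computes both on invented pain scores; the function names and values are illustrative assumptions.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error: penalizes large misses more than small ones."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean absolute error: the average size of a miss."""
    absolute = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(absolute) / len(absolute)


# Hypothetical pain scores on a 0-10 scale for four patients.
observed_scores = [3, 5, 2, 7]
predicted_scores = [4, 5, 1, 5]
example_rmse = rmse(observed_scores, predicted_scores)  # sqrt(6 / 4)
example_mae = mae(observed_scores, predicted_scores)    # 4 / 4
```

Because both metrics inherit the scale of the outcome, an RMSE on a 0-10 pain scale and an RMSE on a drug dosage are not directly comparable.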
Calibration and external validation
Calibration and external validation were notably underreported across the reviewed studies, raising concerns about the real-world readiness of the AI models. Only 5% of studies provided calibration metrics, such as the LR slope and intercept used by Juwara et al., 4 which showed a slope of 1.00 and an intercept of 0.03, indicating a good model fit. External validation was performed in just 14% of studies, including notable examples by Salama et al. 1 and Guan et al., 18 who tested model generalizability on independent cohorts. Despite promising performance metrics, only 23% of the studies demonstrated potential for clinical integration, highlighting the need for improved validation practices and real-world implementation strategies.
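A calibration slope and intercept of the kind reported above can be obtained by regressing observed outcomes on model predictions: a slope near 1 and an intercept near 0 indicate that predictions are neither too extreme nor systematically shifted. The sketch below assumes a continuous outcome and ordinary least squares; the source does not describe how the cited study derived its values, and for a binary outcome the usual approach would instead fit a logistic model on the logit of the predicted risk. The function name and data are hypothetical.

```python
def calibration_line(predicted, observed):
    """Least-squares slope and intercept of observed values regressed on predictions."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    sxx = sum((p - mean_p) ** 2 for p in predicted)
    sxy = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    slope = sxy / sxx
    intercept = mean_o - slope * mean_p
    return slope, intercept


# Hypothetical predictions that track the outcome but sit 0.5 points too low.
predicted_pain = [1.0, 2.0, 3.0, 4.0]
observed_pain = [1.5, 2.5, 3.5, 4.5]
slope, intercept = calibration_line(predicted_pain, observed_pain)
```

In this example the slope is 1 but the intercept is 0.5, which is the pattern of a model that ranks patients correctly while underestimating everyone by the same amount.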
Feature importance and interpretability
Feature importance and interpretability analyses across the reviewed studies highlighted several recurring predictors of NP in oncology. Clinically, acute postoperative pain, anxiety levels (measured via HADS), and type of surgery emerged as consistent top features, particularly in the models of Sun et al. 5 and Juwara et al. 4 On the molecular level, biomarkers such as SA1P and sphingomyelin 33:1, identified by Lötsch et al., 3 as well as MGMT methylation status reported by Khalighi et al., 7 were strongly associated with pain prediction. Emotionally, Cascella et al. 11 found that sadness and anger were dominant features in AI models using facial and speech emotion analysis, offering novel insight into affective predictors of pain experience.
Key thematic findings across studies
Biological markers for neuropathic pain
Biological markers such as SA1P, lysophosphatidylcholines, and sphingomyelins were identified as significant predictors of CIPN in lipidomic studies like that of Lötsch et al. 3 Using ML models, including RF and SVM, the study achieved up to 90% balanced accuracy in classifying pre- and post-paclitaxel plasma samples. In vitro assays confirmed the functional role of SA1P in activating TRPV1 channels and S1P receptors in 11.7% of sensory neurons, offering pathophysiological validation for AI-generated predictions.
Psychological and emotional predictors
Psychological variables such as anxiety, depression, and emotional dysregulation consistently emerged as strong predictors of NP across multiple studies. Juwara et al. 4 and Sun et al. 5 demonstrated that including HADS scores and emotional health indicators improved the prediction of CPSP. Cascella et al. 11 utilized AI models based on speech and facial expressions, where sadness was predicted with a 95% F1-score, showing that affective states can be quantified and modeled to enhance the prediction of pain-related outcomes.
Multimodal data fusion enhances accuracy
Studies integrating diverse data modalities (such as genomic, clinical, imaging, and behavioral data) reported superior predictive accuracy. Juwara et al. 4 and Sun et al. 5 demonstrated that combining psychometric and clinical predictors yielded better model calibration and discrimination (e.g., AUCs of 0.749–0.755 for ML models vs. 0.631 for LR). Cascella et al.'s 11 multimodal emotion analysis framework achieved 84% overall accuracy, underscoring the value of holistic patient profiling in pain prediction.
Potential for clinical integration
Although some models have demonstrated translational potential, such as lipid-targeted interventions like fingolimod or radiomics-driven risk stratification, the overall clinical integration remains limited. Only 23% of studies reported real-world applications, 14% included external validation, and just 5% assessed model calibration. This highlights the need for standardized protocols, transparent algorithms, and broader validation efforts to transition AI tools from research to routine oncology pain management.
Discussion
This systematic review highlights the increasing integration of AI and ML into oncologic pain research, with a particular focus on NP prediction. In the current review, fourteen articles met the inclusion criteria and were published between 2020 and 2025.
The main findings and synthesis of the reviewed studies show that AI models, particularly RF, SVM, and NNs, perform well in identifying patients at risk of NP after cancer treatments such as surgery, chemotherapy, and radiation. Notably, the use of AI allows for the identification of non-linear, complicated interactions between clinical, psychological, and biological variables that traditional statistical methods may miss. Furthermore, several studies consistently found that RF and SVM had higher predictive capacity, with AUC values frequently above 0.80 and positive predictive values exceeding 70–80% in specific scenarios (e.g., Cascella et al. 11 and Sun et al. 5 ). This emphasizes AI's capacity for robust risk stratification and individualized prediction in NP management. In terms of biological and molecular insights, one of the most compelling contributions comes from studies like Lötsch et al., 3 which uniquely integrated lipidomics with ML, uncovering SA1P and sphingomyelins (33:1 and 43:1) as significant biomarkers of CIPN. This not only supports the emerging literature linking sphingolipid metabolism to NP pathophysiology but also offers a biologically interpretable ML output, a crucial step toward clinical trust and translational application. In vitro validation further confirmed the biological relevance of these findings, showing that SA1P activates TRPV1 channels and S1P receptors, known mediators in nociceptive signaling. Regarding psychological and behavioral determinants, the importance of psychological factors (especially anxiety, depression, and postoperative acute pain) was underscored in studies like Juwara et al., 4 Sun et al. 5 and Sipilä et al. 10 Emotional states emerged not only as correlates but also as predictors of NP, particularly CPSP.
This supports biopsychosocial models of pain and suggests that AI-driven emotional and behavioral profiling, as demonstrated in Cascella et al., 11 can enhance real-time pain assessment and monitoring. Furthermore, multimodal models combining speech emotion recognition, facial expression analysis, and psychometric data were able to infer emotional burden and pain presence with accuracies as high as 84%, showing the potential for holistic, patient-centered AI tools in oncology care. In terms of imaging and radiomics applications, AI models utilizing MRI, PET, and radiomic features demonstrated high performance in predicting pain-related outcomes, tumor-induced neuropathy, and molecular pain biomarkers (e.g., MGMT methylation and IDH mutations). As shown in Khalighi et al. 7 and Lin et al., 9 radiomic-based DL models achieved AUCs ranging from 0.85 to 0.99, exceeding standard imaging interpretation. These models have the potential for preventative treatments, surgical guidance, and treatment planning to reduce nerve injury. Despite methodological differences, the majority of studies showed acceptable model discrimination but lacked calibration (only 5% reported slope/intercept measurements) and external validation (14%). As such, current findings, while promising, should be interpreted with caution. Moreover, only 23% of the studies showed evidence of clinical implementation, revealing a significant translational gap between AI development and its use in real-world oncology settings. Another limitation is the heterogeneity in assessment tools, input variables, and outcome definitions, which hampers meta-analytic synthesis and model generalizability. Standardization in data collection, preprocessing, and pain measurement is essential moving forward. A recurring concern across studies is the lack of transparency in model decision-making—a major barrier to clinical adoption. 
While some models (e.g., RF) allow variable importance analysis, others (e.g., deep NNs) are inherently opaque. Model explainability using techniques like SHAP values, LIME, or the cABC analysis used by Lötsch et al. 3 should become standard to ensure clinical trust. Furthermore, SA1P may have a role in paclitaxel-induced biochemical alterations that lead to neuropathic side effects. The identified SA1P receptors could be a therapeutic target for co-therapy with paclitaxel to minimize one of its significant and therapy-limiting adverse effects. 23 In addition, Seth et al. 24 focused on refining the prediction of pain outcomes after breast cancer surgery and concluded that additional research is needed to advance AI models from the experimental stage to actual implementation in healthcare. With AI's rapid progress in the digital age, new applications, such as direct breast surgery performance, are envisaged in the coming years. Breast surgeons and healthcare professionals should stay up to date on breakthroughs in AI applications in breast surgery to deliver the best possible treatment to their patients. Finally, ethical concerns also arise regarding bias, especially in studies with underrepresentation of minority populations or LMICs. Future research must prioritize diverse datasets, open science practices, and ethical frameworks to ensure AI applications do not exacerbate disparities in cancer pain care.
Implications for policy, practice, and research
The policy, clinical practice, and research implications of this systematic review underscore the transformative potential of AI in improving NP prediction and management in oncology. Clinically, AI models facilitate early identification of patients at high risk for NP even before treatment begins, enabling timely and targeted interventions. Integration with electronic health records could further support personalized pain management strategies, optimizing pharmacologic and psychological therapies. Moreover, tools like those developed by Cascella et al. demonstrate how multimodal AI—incorporating facial and speech emotion analysis—can provide real-time, remote pain monitoring in a non-invasive manner. On the research front, there is an urgent need for prospective, multicenter trials with standardized data and rigorous external validation to ensure reproducibility and generalizability of findings. The development of explainable, hybrid AI models that combine omics, clinical, and behavioral data is essential to enhance model transparency and clinical trust. Additionally, current research is heavily skewed toward breast cancer populations; therefore, expanding AI applications to underrepresented cancer types such as head and neck, colorectal, and hematologic malignancies is critical to improving equity and comprehensiveness in pain management.
Limitations of this review
This review is limited by the heterogeneity of the included studies in terms of design, data modalities, AI models, and pain definitions, which precluded formal meta-analysis. Additionally, some included narrative reviews and exploratory studies may lack the empirical rigor of RCTs or large observational cohorts. Finally, despite a comprehensive search, publication bias cannot be ruled out, as studies with non-significant findings are less likely to be published.
Future directions
Future directions in the application of AI for NP prediction in oncology emphasize the critical need for methodological and technological advancements to bridge the gap between research and real-world implementation. Standardization of outcome metrics, data formats, and pain classification systems is essential to ensure consistency, comparability, and reproducibility across studies. The development of explainable AI (XAI) is paramount to enhancing model transparency, clinician trust, and regulatory acceptance, especially in high-stakes clinical decision-making. Federated learning frameworks offer promising avenues for secure, decentralized model training across institutions, allowing for robust model generalization while preserving patient privacy. To support global applicability, there is also a pressing need for AI tools that are multilingual and culturally sensitive, particularly for deployment in LMICs. Finally, embedding AI into digital health platforms—including mHealth applications, wearable biosensors, and telemonitoring systems—can facilitate continuous, real-time assessment and personalized management of NP, marking a significant step forward in precision oncology and patient-centered care.
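The federated learning idea mentioned above can be illustrated by its simplest aggregation step, often called federated averaging: each institution trains locally and shares only model parameters, which a coordinator combines weighted by local sample size. The sketch below shows that single step under simplifying assumptions (two hypothetical sites, a two-parameter model, no secure aggregation or repeated rounds); the function name `federated_average` and all values are illustrative.

```python
def federated_average(site_parameters, site_sizes):
    """Sample-size-weighted mean of per-site parameter vectors."""
    total = sum(site_sizes)
    n_params = len(site_parameters[0])
    return [
        sum(params[i] * size for params, size in zip(site_parameters, site_sizes)) / total
        for i in range(n_params)
    ]


# Hypothetical locally trained parameters from two hospitals; no patient data is shared.
site_parameters = [[0.2, 1.0],   # site A, trained on 100 patients
                   [0.4, 3.0]]   # site B, trained on 300 patients
site_sizes = [100, 300]
global_parameters = federated_average(site_parameters, site_sizes)
```

The larger site pulls the combined parameters toward its own values, which is why representation across sites matters for the generalizability concerns raised in this review.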
Conclusion
In conclusion, this systematic review underscores the growing potential of AI and ML in predicting NP among cancer patients, offering promising accuracy, especially through models like RF, SVM, and DL. Multimodal data integration—including clinical, emotional, imaging, and molecular inputs—enhanced model performance and relevance. However, limitations such as insufficient external validation, lack of calibration, and poor interpretability continue to hinder clinical implementation. Future efforts must focus on methodological standardization, XAI development, and inclusive datasets to realize the full potential of AI in advancing personalized pain management in oncology.
