Introduction
Artificial intelligence (AI) has emerged as a groundbreaking innovation with transformative potential across diverse industries, including healthcare. Its integration into medical practices has catalyzed significant advancements in diagnostics, personalized medicine, and patient care management, underscoring its versatility and impact.1,2 AI-driven tools excel at processing and analyzing vast datasets, uncovering patterns that often surpass human capabilities in specific medical tasks. For example, a study by Kim et al. 3 demonstrated that an AI algorithm trained on extensive mammography data achieved superior diagnostic accuracy in detecting breast cancer compared to traditional radiologists. Similarly, research has shown that AI systems can significantly reduce false positives and negatives in mammogram interpretations, further enhancing diagnostic precision. 4 Comparable breakthroughs have been documented in the detection of pneumonia through chest radiographs, 5 cardiac arrhythmias, 6 diabetic retinopathy,7,8 and skin cancer.9,10 Beyond diagnostics, AI-powered healthcare systems have proven highly effective in formulating evidence-based treatment plans. By synthesizing patient records and medical literature, these systems provide tailored treatment recommendations that often align with or exceed the expertise of human clinicians. IBM's Watson for Oncology, for instance, leverages vast clinical datasets to deliver personalized cancer treatment options, achieving high levels of agreement with oncologists’ recommendations. 11 Studies further underscore AI's role in optimizing treatment strategies for various conditions, improving patient outcomes, and reducing medical errors.12,1 Moreover, AI is reshaping hospital operations by automating administrative tasks, enhancing efficiency, and allowing healthcare providers to dedicate more time to patient care, thereby improving the overall quality of healthcare delivery.13,14
In Africa, the adoption of AI in healthcare, although slower than in other regions, is gaining momentum through innovative initiatives, such as leveraging AI to revolutionize cancer diagnostics through cervical cell image analysis, tissue evaluation, disease prediction, and teleradiology. 15 Research has also highlighted AI's efficacy in diagnosing and predicting cervical cancer, colorectal cancer, leukemia, breast cancer, lung cancer, and pulmonary tuberculosis.15,16 In Tanzania, Mwanga et al. 17 introduced an innovative approach to malaria screening by integrating a logistic regression model with mid-infrared (MIR) spectroscopy, utilizing human dried blood spots (DBS) as the primary sample medium. This method leverages the unique MIR spectra captured by a spectrometer and trains a model to classify DBS samples as either malaria-positive or malaria-negative. The proposed method is promising for enabling rapid, high-throughput screening of malaria.
Despite these promising initiatives, the widespread adoption of AI in healthcare remains limited in Africa due to several persistent challenges, including data scarcity. The shortage of curated clinical datasets limits both the training of robust AI models and their translation into practice in low-resource settings. This scarcity undermines progress toward the African Union's Continental AI Strategy targets and the WHO Global Strategy on Digital Health, both of which emphasize data quality, accessibility, and equity. Specifically, this article focuses on the issue of limited access to high-quality clinical data, which significantly impedes the development and deployment of AI models across the continent. While there are notable advancements that illustrate the growing global reliance on synthetic data to overcome clinical data limitations, Africa's engagement with this emerging approach remains underexplored. To this end, this study examines the potential of synthetic data for healthcare AI in Africa with a dual focus on technical and governance dimensions. The contributions of this study are threefold: first, it synthesized evidence on how synthetic data performs across technical dimensions such as fidelity, generalization, and domain adaptation in African healthcare contexts. Second, it examined governance, regulatory readiness, and institutional constraints that influence the safe use of synthetic data in low-resource settings. Third, it developed a conceptual framework that integrates technical and policy considerations to guide future evaluation, deployment, and oversight of synthetic data-based healthcare AI systems.
The remainder of this article is organized as follows. The second section examines the nature of clinical data scarcity within African health systems and provides an overview of synthetic data. The third section outlines the study's methodology, including the review design, search strategy, and selection criteria. The fourth section presents the study's results, while the fifth section offers a detailed discussion of the technical findings, adoption barriers, and existing knowledge gaps. Finally, the sixth section concludes the article with policy recommendations for the safe and effective integration of synthetic data into healthcare AI in Africa.
Data scarcity and application of synthetic data
Clinical data scarcity
Clinical data serves as the foundation for machine learning (ML) algorithms to create computational models that generate predictive insights for informed medical decision making. 20 Nevertheless, limited access to such data poses a significant obstacle to the practical application of ML tools in healthcare. 21 Africa, in particular, faces substantial challenges due to low levels of digitization, which limit the availability of locally generated clinical data, a crucial resource for developing AI-driven healthcare solutions. 22 For example, many healthcare facilities in sub-Saharan Africa still rely on fragmented health information systems, ranging from paper-based records to isolated digital platforms. This fragmentation leads to inconsistencies in data quality and a lack of interoperability between different systems.23,24 Notably, the paucity of health data is a significant barrier to Africa's adoption of digital health solutions and evidence-based clinical practices. 25 Although Africa faces the highest burden of both communicable and noncommunicable diseases worldwide, it accounts for only about 1% of global research output. This disparity is partly due to challenges in accessing and maintaining medical data. 26 The AU Continental Artificial Intelligence Strategy 2024 27 highlights a significant gap in the inclusiveness, quality, and availability of data to support African AI innovations. The strategy entails several actions to enhance data availability on the continent, including cross-border data sharing among AU member states. Moreover, it is pertinent to note that even when clinical data is available, a significant portion remains underutilized due to issues related to data complexity, storage, and poor management systems. 28 This challenge is further intensified by the prevalence of imbalanced and low-quality data, which not only hampers the effective implementation of ML models29,30,31 but also amplifies issues such as model underfitting, overfitting, and the risk of biased outcomes.32,15 Data scarcity and incompleteness have been reported in several studies,30,33,34 demonstrating how they can lead to reliance on algorithms predominantly developed outside Africa. 35 Deploying such externally developed tools in new settings can result in several issues, including poor model performance, biased outcomes, and misaligned clinical relevance. Some notable examples highlighting the impact of data unavailability include the study by Mwanga et al. 17 While their proposed AI-driven malaria screening tool demonstrated significant potential, its real-world application was hindered by the lack of diverse and representative datasets, which are essential for comprehensive validation and refinement. In addition, Manson et al. 36 emphasize that the limited availability of sufficient radiological images has been a major obstacle to the adoption of AI in radiotherapy, as these images are critical for training deep learning models.
Synthetic data
Synthetic data has emerged as a solution to address the challenges of limited or inaccessible real-world data. 37 In healthcare, real-world data can be costly, difficult to access, or pose privacy risks. Synthetic data is artificially created to mimic the patterns and properties of real data. Synthetic data generation enables the creation of high-quality, diverse datasets for training and validating machine learning models, providing a valuable resource for healthcare applications where data availability is often limited. For instance, to address the scarcity of chest X-rays and brain magnetic resonance imaging (MRIs), critical for diagnosing respiratory and neurological diseases, respectively, Dhawan and Nijhawan 38 developed a specialized Generative Adversarial Network (GAN) architecture to generate synthetic imaging data. By integrating synthetic data with real datasets, their approach significantly improved classification performance, achieving up to 85.9% accuracy on a brain MRI task using the EfficientNet v2 model. Moreover, Montes et al. 39 demonstrated that deep generative models, such as GANs and Restricted Boltzmann Machines, can effectively learn and replicate complex genomic data distributions, generating high-quality artificial genomes (AGs) with minimal risk to privacy. Their findings demonstrate that AGs preserve key genetic characteristics, enhance imputation accuracy, and support supervised learning tasks in genomic studies. Schmitz et al. 40 developed SynthMD, a hierarchical data generation tool that produces synthetic datasets for rare diseases using publicly available statistics, addressing the critical data scarcity challenge in rare disease research. Their open-source approach enables researchers to simulate patient data for conditions such as Sickle Cell Disease, Cystic Fibrosis, and Duchenne Muscular Dystrophy, facilitating methodological research and privacy-preserving innovation.
These examples demonstrate the value of synthetic data in developing healthcare AI models when real data is limited.
Various techniques are employed to generate synthetic data, including GANs, simulation models, and data augmentation methods. GANs work by training two neural networks against each other: a generator that creates synthetic data, and a discriminator that evaluates how closely the synthetic data matches real data. 41 Over time, the generator improves its ability to produce realistic data, making it highly effective for generating synthetic images, texts, and other data types. In healthcare, GANs have been utilized to generate synthetic medical images, including computed tomography (CT), MRI, and X-ray scans, which are crucial for training deep learning models for disease detection. 38 These synthetic medical images can be created from limited real-world data, reducing the need for extensively annotated datasets, which are often expensive and tedious to produce.
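To make the adversarial training loop concrete, here is a deliberately tiny, self-contained sketch on one-dimensional data using only the Python standard library. The linear generator, logistic discriminator, target distribution N(4, 1.25), learning rate, and step count are all illustrative assumptions rather than details drawn from the cited studies.

```python
import math
import random

random.seed(0)

# Toy 1-D GAN: the generator maps noise z ~ N(0,1) to x = w*z + b,
# and the discriminator is a logistic classifier D(x) = sigmoid(u*x + v).
# The "real" data come from N(4, 1.25). All constants are illustrative.
REAL_MEAN, REAL_STD = 4.0, 1.25

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, t))))

w, b = 1.0, 0.0   # generator parameters
u, v = 0.0, 0.0   # discriminator parameters
lr, batch = 0.05, 32

for _ in range(3000):
    reals = [random.gauss(REAL_MEAN, REAL_STD) for _ in range(batch)]
    zs = [random.gauss(0.0, 1.0) for _ in range(batch)]
    fakes = [w * z + b for z in zs]

    # Discriminator step: ascend log D(real) + log(1 - D(fake)).
    du = dv = 0.0
    for xr in reals:
        p = sigmoid(u * xr + v)
        du += (1.0 - p) * xr
        dv += (1.0 - p)
    for xf in fakes:
        p = sigmoid(u * xf + v)
        du -= p * xf
        dv -= p
    u += lr * du / batch
    v += lr * dv / batch

    # Generator step (non-saturating loss): ascend log D(fake).
    dw = db = 0.0
    for z, xf in zip(zs, fakes):
        p = sigmoid(u * xf + v)
        grad_x = (1.0 - p) * u   # d log D(x) / dx
        dw += grad_x * z
        db += grad_x
    w += lr * dw / batch
    b += lr * db / batch

fake_mean = sum(w * random.gauss(0.0, 1.0) + b for _ in range(2000)) / 2000
print(round(fake_mean, 2))  # drifts toward the real mean of 4
```

Even at this scale the sketch shows the characteristic behavior: the discriminator's feedback steers the generator's output distribution toward the real one, although training can oscillate and the exact result varies with the random seed.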
On the other hand, simulation models are widely used to generate synthetic data by producing datasets based on specified rules, parameters, and variables. 42 In healthcare, simulation models can create synthetic patient records, medical histories, and even time-series data, such as cardiac blood volume pulse. 43 With such models, researchers can simulate complex patient scenarios that would be difficult or impractical to observe in real life. Finally, data augmentation techniques are employed to replicate real-world phenomena, including variations in object and scene appearance resulting from changes in pose, viewpoint, lens distortion, deformation, and other camera-related artifacts. 44 This is particularly useful in image-based tasks, where augmenting a limited dataset with slightly altered images can provide a more extensive, varied dataset for training AI models. For example, medical image augmentation can help train AI models for diagnostic purposes by generating variations of existing scans, allowing the model to learn from a broader range of conditions. 45 Readers seeking an in-depth analysis of synthetic data generation methods in healthcare are encouraged to consult the study by Pezoulas et al. 46
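As a minimal illustration of augmentation, the geometric transforms mentioned above can be sketched with plain Python lists standing in for pixel arrays. A real medical-imaging pipeline would use a dedicated library and clinically vetted, label-safe transforms, so this is purely a conceptual sketch.

```python
# Simple geometric augmentations on a tiny "image" (a nested list of pixels).

def hflip(img):
    """Mirror each row (left-right flip)."""
    return [row[::-1] for row in img]

def vflip(img):
    """Reverse the row order (top-bottom flip)."""
    return img[::-1]

def rotate90(img):
    """Rotate 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def augment(img):
    """Return the original image plus simple geometric variants."""
    return [img, hflip(img), vflip(img), rotate90(img)]

scan = [[0, 1],
        [2, 3]]
variants = augment(scan)
print(len(variants))  # 4 training samples derived from one image
```

Each variant preserves the diagnostic content while changing its appearance, which is why such transforms can stretch a limited dataset further during training.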
The practical applications of synthetic data have been demonstrated globally. For example, in Europe, the EU-funded SYNTHIA project exemplifies large-scale synthetic data integration, generating multimodal datasets that include imaging, genomics, and clinical notes to train AI models for diseases such as lung cancer, Alzheimer's disease, and type 2 diabetes. 47 In Germany, the NFDI4Health initiative has developed SYNDAT, a web-based tool that utilizes AI models, such as VAMBN and MultiNODEs, to generate and assess synthetic health data. It enables the evaluation of data utility and privacy risks, with practical applications demonstrated using real-world datasets for Alzheimer's disease and cancer. 48
Despite its promise, synthetic data presents several challenges that must be addressed to ensure its effectiveness in real-world healthcare applications. Key concerns include maintaining the authenticity and fidelity of generated data, minimizing algorithmic bias, and rigorously validating models trained on synthetic datasets. 46 Additionally, ethical considerations, particularly regarding privacy, fairness, and transparency, remain crucial in the deployment of synthetic data within sensitive domains, such as healthcare. 49 A recent study proposing an evaluation framework for synthetic data in health noted that ensuring a robust fidelity-utility-privacy tradeoff is challenging, as synthetic data generators must balance realism with privacy guarantees. 50 Kaabachi et al. 51 suggest that there is no universally accepted set of benchmarks or evaluation metrics, which hampers the validation, comparison, and adoption of synthetic datasets at scale. Moreover, synthetic health data can reproduce or even amplify biases present in the original data, raising concerns about fairness. 52
Methodology
Using a Critical Literature Review (CLR), this study interrogated relevant scholarly and technical literature focused on the use of synthetic data for AI in healthcare across African settings. The review was guided by three research questions: (1) How has synthetic data been used to support healthcare AI development in Africa and comparable low-resource settings? (2) What technical performance issues arise concerning fidelity, generalization, and domain adaptation? (3) What governance, regulatory, and institutional factors influence the adoption and safe use of synthetic data in African healthcare? Policy-relevant studies, including those discussing regulation, governance, and health technology assessment, were included to align the analysis with health policy priorities. Rather than simply summarizing findings, the CLR method provided a systematic process to reveal conceptual assumptions, empirical inconsistencies, and research gaps.53,54 The approach also enabled an assessment of technical claims and socioinstitutional barriers, which are often overlooked in traditional narrative reviews.54,55
A systematic search was conducted across Scopus, Web of Science, PubMed, and Google Scholar, targeting peer-reviewed journal articles, technical reports, and institutional publications. The following search string was applied across all databases for consistency and transparency: (“synthetic data” AND (“healthcare” OR “clinical”) AND (“artificial intelligence” OR “machine learning”) AND (“Africa” OR “low-resource” OR “low-income”)). Search terms were chosen to identify articles related to (i) synthetic data and healthcare AI, (ii) Africa, (iii) domain adaptation, and (iv) data scarcity. The search covered literature published between 2010 and 2025, reflecting the period during which synthetic data approaches gained relevance in healthcare AI. We included studies based on their relevance to synthetic data generation, model training processes, and adoption in low-resource healthcare systems.
The quality and relevance of retrieved literature were assessed through predefined inclusion and exclusion criteria focusing on methodological rigor, clarity of synthetic data evaluation, and applicability to healthcare AI in low-resource settings. Studies were further appraised using a structured checklist that examined credibility, transparency of methods, and alignment with the review's objectives to ensure that only robust and contextually relevant evidence was included. The inclusion criteria were: (1) studies involving synthetic data for healthcare or clinical AI; (2) studies presenting technical performance outcomes; (3) research relevant to Africa or other low-resource health systems; and (4) peer-reviewed articles or high-quality institutional reports. Exclusion criteria were: (1) studies unrelated to healthcare; (2) papers lacking methodological detail on synthetic data generation; (3) commentaries without empirical or technical content; and (4) non-English publications. Insights were synthesized using thematic analysis techniques, allowing for critical engagement with the dominant discourses present within them and the situational applicability of these discourses.
The review followed Critical Literature Review principles, combined with PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidance for documenting the search and selection process.

PRISMA flow diagram of the search process for the studies for review (source: authors’ construct).
Results
Technical performance and model reliability
Synthetic data appears to be a promising solution to address the issue of insufficient data in healthcare AI; however, a thorough analysis reveals significant performance limitations.64,65 Many models developed solely from synthetic data showed significantly lower generalizability when tested for clinical use, especially in heterogeneous, low-resource settings.57,56 Generative techniques, such as medGAN, yielded more realistic synthetic data, but fidelity gaps remained, impacting generalizability and, ultimately, the reliability of diagnostics. The difference is crucial because it highlights the need to recalibrate expectations regarding training with synthetic data only. The prominence of GAN-based methods in Table 1 reflects the current state of published research rather than reviewer or author bias. GANs remain the dominant approach used in medical synthetic data generation, which explains their strong representation in the reviewed studies.
Summary of key studies on synthetic data use in healthcare AI (source: authors’ construct).
3D: three-dimensional; AI: artificial intelligence; EHR: electronic health records; GAN: generative adversarial network; ICU: intensive care unit; ML: machine learning; MRI: magnetic resonance imaging.
Significantly, the most common method used by studies has been to assess models using (often overly optimistic) internal synthetic test sets. 60 Only a fraction of the studies in our review, summarized in Table 1, compared synthetic-only with synthetic–real hybrid training approaches. In all of these comparisons, models incorporating both synthetic and real data consistently outperformed real-only and synthetic-only models in terms of generalization (predictive performance on held-out data) and reduced prediction errors within underrepresented cohorts, such as specific ethnic groups. Even in the presence of such evidence, hybrid modeling continues to be underutilized in African health AI projects, signaling a lack of coherence between technical advancements and the processes applied towards implementation. These findings have direct policy implications for health technology assessment (HTA) processes, suggesting that regulatory bodies should require hybrid validation protocols before approving synthetic data-based AI tools for clinical use.
Current techniques for generating synthetic healthcare data include generative adversarial networks, diffusion models, probabilistic models, simulation-based generators, transformer-based language models for clinical text, and classical augmentation pipelines. These techniques produce structured records, medical images, clinical notes, and time-series signals, each with varying fidelity and resource requirements. Recent advances in diffusion models and multimodal generators have expanded the range of synthetic data modalities available for clinical AI development.
Availability of synthetic datasets varies globally. High-income regions host numerous open-source repositories containing synthetic medical images, structured clinical records, and genomic datasets. In contrast, Africa has very limited publicly available synthetic datasets due to infrastructure constraints, fragmented health information systems, and restrictive data governance environments. The lack of shared synthetic datasets impedes benchmarking, validation, and model reproducibility in African healthcare AI projects.
Data fidelity and risks of bias
Data fidelity emerged as a decisive factor influencing the clinical utility of synthetic datasets. In this review, fidelity is defined by two dimensions. First is distributional fidelity, which captures how closely the synthetic data match the overall statistical properties of real datasets. Second is clinical fidelity, which assesses whether rare events, comorbidity patterns, and sociodemographic variations are represented realistically. Both dimensions are essential for safe clinical deployment, as low fidelity increases the risk of biased predictions and diagnostic misclassification. However, the fidelity limitation was significant across many studies, where generative models accurately reproduced general features of new patients but failed to capture comorbidities, rare conditions, and sociodemographic variations.58,59 This is likely to have increased the risk of diagnostic bias and misclassification.
Globally, key challenges include inconsistent fidelity across modalities, limited validation protocols, unclear privacy guarantees, and the absence of benchmarking standards. In Africa, these challenges are intensified by weak digital infrastructure, limited computational capacity, scarce local datasets, and complex regulatory environments. These factors combine to reduce the practical utility of synthetic data in resource-limited clinical settings, particularly where rare conditions and diverse comorbidity patterns remain underrepresented.
In several contexts with limited or incomplete baseline datasets, generative models constructed from incomplete records perpetuated systemic bias instead of countering it, raising concerns about fidelity. 60 Table 1 shows that, even in state-of-the-art applications of synthetic datasets, certain fidelity limitations persist, underscoring the need for standardized fidelity assessment metrics. Importantly, healthcare systems must not release synthetic-data-driven AI for patient care without validation methods specific to the intended use case.
Domain adaptation for low-resource health systems
Domain adaptation techniques received increasing attention as a bridge between synthetic and real-world environments, since the domain gap prevents models trained on synthetic data from transferring directly to real-world settings. Transfer learning, adversarial fine-tuning, and domain randomization approaches have been used to increase model robustness in this scenario by addressing distribution shift.62,61 Such techniques facilitate better diagnostic fairness and allow classifiers to generalize across population segments historically underrepresented in training datasets. Nonetheless, only a few studies have empirically explored domain adaptation in resource-constrained scenarios. Notably, Zhu et al. 63 demonstrated that their approach was effective in low- and middle-income countries using only a small amount of real data for recalibration, provided the recalibration threshold was minimal. Beyond imaging and structured electronic health records (EHRs), synthetic data have also been applied to clinical notes, diagnostic reports, and time-series data such as electrocardiogram, photoplethysmography, and intensive care unit monitoring signals. These modalities introduce additional challenges due to temporal dependencies, linguistic variation, and the need to preserve clinical coherence. Emerging transformer-based models and physiologically grounded simulators have shown promise in generating higher-fidelity text and time-series data suitable for downstream clinical tasks. In principle, domain adaptation should transfer well between health systems with comparable capacities, and there are hints of applicability when a model is implemented in a new region or population; in practice, however, many African health systems lack these capacities, making direct application problematic at best.
Achieving adaptation benefits requires policymakers to facilitate targeted investments in infrastructure and capacity building.
Adoption barriers and policy readiness in Africa
While there have certainly been strides in technology, structural challenges to adoption remain. The lack of adequate infrastructure, particularly erratic electricity, limited data storage capacity, and poor internet access, continues to prevent the use of synthetic data in many African health systems. 69 Such restrictions hinder both the generation and effective deployment of AI systems. Moreover, as shown in Table 2, institutional fragmentation and inadequate collaboration between health and information and communication technology stakeholders hinder implementation. Shortages in skills and human capital compound these challenges. 70 Existing studies have assumed high technical capabilities for model deployment and domain adaptation; neither assumption holds widely across the continent. 61 Ethical concerns also complicate the adoption process. Without rigorous validation requirements, synthetic data-based models could enter clinical use without oversight or accountability. In particular, the lack of transparency in model development is one of the many reasons behind clinician skepticism. 68 For synthetic data to become the norm, it must be embedded in a broader digital health approach with explicit regulatory clarity and stakeholder alignment.
Adoption barriers of synthetic data in African healthcare contexts (source: authors’ construct).
Evaluation of synthetic data
To support safe and responsible use, a practical framework for evaluating and releasing synthetic healthcare data is essential. Generally, when evaluating synthetic data, the most commonly examined aspects are fidelity, utility, and privacy. 71
Fidelity is evaluated by examining how closely synthetic data matches real data at both the individual-feature level and across multiple features or sequences. First, feature-wise fidelity can be assessed using descriptive statistics, such as mean, variance, and range, to ensure basic alignment between the real and synthetic distributions. This can be supplemented with statistical drift measures and Goodness-of-Fit tests, such as the Kolmogorov–Smirnov test, which provide statistical evidence on how well each feature's distribution is reproduced. For a more quantitative measure, a classifier-based test can be employed: a model is trained to distinguish between real and synthetic data, where high accuracy indicates poor fidelity (the data are easily distinguishable), while accuracy near 50% suggests strong fidelity. 72
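The feature-wise checks above can be sketched in a few lines of standard-library Python. The blood-pressure-like numbers and the "good" versus "bad" generators below are simulated stand-ins rather than real clinical data, and a production workflow would use a library implementation such as scipy.stats.ks_2samp.

```python
import random

random.seed(1)

# Two-sample Kolmogorov–Smirnov statistic from scratch: the maximum gap
# between the two empirical CDFs. 0 means identical distributions;
# values near 1 mean badly mismatched ones.
def ks_statistic(a, b):
    a, b = sorted(a), sorted(b)
    d = 0.0
    for v in sorted(set(a) | set(b)):
        cdf_a = sum(1 for x in a if x <= v) / len(a)
        cdf_b = sum(1 for x in b if x <= v) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d

real = [random.gauss(120, 15) for _ in range(500)]        # e.g. systolic BP
good_synth = [random.gauss(120, 15) for _ in range(500)]  # well-matched generator
bad_synth = [random.gauss(150, 5) for _ in range(500)]    # shifted, low fidelity

print(round(ks_statistic(real, good_synth), 2))  # small gap: high fidelity
print(round(ks_statistic(real, bad_synth), 2))   # large gap: poor fidelity
```

The same real-versus-synthetic framing underlies the classifier-based test: if a model can tell the two sources apart much better than chance, feature fidelity is poor.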
The recent study by Adams et al. 73 evaluated fidelity using three complementary measures.
On the other hand, developers should conduct utility assessments to ensure that synthetic datasets preserve clinically meaningful patterns. Evaluating predictive model performance on both real and synthetic health records helps detect loss of clinical signal before deployment. Achterberg et al. 72 suggest that the utility of synthetic data is usually evaluated by determining whether it can replace real data in standard analytical tasks while maintaining comparable performance. This evaluation entails comparing the performance of models trained on synthetic and real data when tested on a real test set. This approach is commonly referred to as the Train Synthetic Test Real (TSTR) approach.50,72 For example, in assessing the utility of synthetic data, Hernandez et al. 50 trained several machine learning models on both real and synthetic tabular datasets and evaluated them using real data. They then compared the performance differences between models trained on real and synthetic data, calculating the statistical significance of these differences. Smaller performance gaps indicate higher utility of the synthetic data.
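A toy standard-library sketch of the TSTR protocol described above: the one-feature threshold classifier and the simulated "synthetic" data (real data shifted slightly) are illustrative assumptions, whereas real evaluations such as Hernandez et al. 50 use full ML models and genuine generators.

```python
import random

random.seed(2)

def make_data(n, shift=0.0):
    """Binary-labeled 1-D data: class 1 sits higher than class 0."""
    data = []
    for _ in range(n):
        y = random.random() < 0.5
        x = random.gauss(2.0 if y else 0.0, 1.0) + shift
        data.append((x, int(y)))
    return data

def fit_threshold(data):
    """Pick the decision threshold that best separates the two classes."""
    best_t, best_acc = 0.0, 0.0
    for t, _ in data:
        acc = sum((x > t) == bool(y) for x, y in data) / len(data)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def accuracy(t, data):
    return sum((x > t) == bool(y) for x, y in data) / len(data)

real_train, real_test = make_data(400), make_data(400)
synth_train = make_data(400, shift=0.1)  # slightly imperfect synthetic copy

trtr = accuracy(fit_threshold(real_train), real_test)  # Train Real, Test Real
tstr = accuracy(fit_threshold(synth_train), real_test) # Train Synthetic, Test Real
print(round(trtr, 2), round(tstr, 2))  # a small gap indicates high utility
```

Comparing the two accuracies on the same real test set is the core of TSTR: the closer the synthetic-trained model comes to the real-trained baseline, the more clinical signal the generator has preserved.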
Equally important is the evaluation of privacy risks, which aims to uncover how effectively an external party could infer sensitive information from the original dataset if they had access to the synthetic data. It is worth noting that high-fidelity synthetic data can still leak identifiable attributes if not properly assessed. Privacy in synthetic data is a multidimensional concept, reflected in the many metrics used to quantify it. Common approaches include similarity tests, counting the number of identical records between the original and synthetic datasets, or distance-based metrics such as the mean absolute error between real and synthetic records.74,75 However, Giomi et al. 76 suggest that these metrics often provide limited insight into the real-world risks associated with individual records. Consequently, their study proposed evaluating privacy through concrete attack scenarios, focusing on singling out, linkability, and inference attacks, which are legally mandated considerations under the General Data Protection Regulation (GDPR). Singling out occurs when the original dataset contains exactly one record with a specific, uniquely identifiable combination of attributes. Linkability, in contrast, measures the risk of connecting multiple records to the same individual, which can facilitate de-anonymization. Inference involves deducing whether a real record was used in training (membership inference) by finding its closest synthetic counterpart, and/or inferring the unknown sensitive attribute of that record (attribute inference) by comparing the sensitive value of its synthetic neighbor to the real value. Moreover, to ensure privacy, the Differential Privacy (DP) 77 framework is often used to add carefully calibrated noise during data generation so that the synthetic data reflects overall patterns while guaranteeing that no single person's information can be accurately inferred.
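One of the distance-based screens mentioned above can be sketched directly. The distance-to-closest-record check below, with simulated three-feature records and a deliberately "leaky" generator that memorizes training rows, is an illustrative assumption rather than a standard API or a metric mandated by any regulation.

```python
import random

random.seed(3)

# Distance-to-closest-record screen: a synthetic record that sits
# suspiciously close to a real training record may leak that
# individual's data. All data and thresholds here are illustrative.
def dcr(record, real_data):
    """L1 distance from one synthetic record to its nearest real record."""
    return min(sum(abs(a - b) for a, b in zip(record, r)) for r in real_data)

real = [[random.gauss(0, 1) for _ in range(3)] for _ in range(200)]

# A leaky generator that memorizes: it returns real records verbatim.
leaky_synth = random.sample(real, 20)
# A safer generator: fresh draws from the same distribution.
safer_synth = [[random.gauss(0, 1) for _ in range(3)] for _ in range(20)]

leaky_min = min(dcr(s, real) for s in leaky_synth)
safer_min = min(dcr(s, real) for s in safer_synth)
print(leaky_min, safer_min)  # memorized records sit at distance zero
```

A zero (or near-zero) minimum distance is exactly the kind of signal that similarity- and linkage-style attacks exploit, which is why such screens are run before any synthetic release.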
While these aspects can be individually assessed, maintaining the tradeoff between them, for instance, between utility and privacy, remains a challenge. If not well assessed, synthetic data may lead to unpredictable utility loss and highly unpredictable privacy risks. 78 The study by Adams et al. 73 evaluated the tradeoffs across five synthetic data models and three patient-level datasets in terms of privacy, fidelity, and utility. The study shows that when data fidelity declines, particularly when correlation patterns are not well preserved, it negatively impacts the overall utility of the data. Moreover, achieving high levels of fidelity can also introduce significant privacy risks.
Figure 2 illustrates the proposed conceptual framework for the critical evaluation of synthetic data in African healthcare settings. It outlines a multilayered process for evaluating synthetic data within the African healthcare context. Evaluation begins with technical components, where three core dimensions are assessed: data fidelity, which captures how closely synthetic data replicates real-world distributions and relationships; model performance, which measures how well models trained on synthetic data perform compared to those trained on real data; and domain adaptation, which describes how effectively synthetic data supports clinical tasks in local healthcare settings. These evaluations uncover fidelity challenges, performance limitations, and adaptation requirements.

Conceptual framework for critical evaluation of synthetic data in African healthcare AI (source: authors’ construct).
Next, the framework underscores that synthetic data must be evaluated within the African healthcare context, which is characterized by resource constraints, fragmented health systems, and limited baseline data. These factors directly affect the feasibility and reliability of synthetic data generation and validation. A hybrid feedback loop iteratively combines synthetic and real data. Finally, a robust evaluation of synthetic data must address adoption challenges, including technical infrastructure limitations, institutional fragmentation, human capacity gaps, and ethical, trust-related, and regulatory issues. These contextual factors shape how synthetic data is evaluated, not only in terms of technical merit but also with respect to safety, privacy, acceptability, and suitability for real-world healthcare deployment. Accordingly, it is recommended that synthetic data releases be accompanied by transparent documentation, such as data cards or model cards, detailing generation methods, intended use cases, and known limitations to guide safe adoption within African healthcare ecosystems.
Discussion
This study underscores that synthetic data offer both opportunities and constraints for healthcare AI in African contexts. Technical evaluations across the reviewed studies highlighted consistent performance gains when synthetic data were combined with small, real-world datasets, confirming that hybrid training strategies delivered superior generalization and reduced predictive bias. These findings indicate that synthetic-only approaches remain insufficient for most clinical use cases in low-resource health systems, where distributional gaps and population heterogeneity exist. Furthermore, persistent fidelity gaps, incomplete representation of rare conditions, and variable model stability underscore the need for stronger empirical evidence before large-scale clinical deployment in African settings.
Several knowledge gaps emerged from the synthesis of evidence. First, only a small subset of studies provided disaggregated performance metrics for clinically important subgroups, despite the well-known risks of bias amplification when synthetic datasets lack demographic diversity. Second, very few studies examined longitudinal and multimodal data beyond imaging and structured EHRs, leaving limited evidence for use-cases involving clinical notes, time-series signals, or multisource health records. Third, the governance literature revealed fragmented regulatory environments in African health systems, with minimal guidance on privacy-preserving synthetic data generation, quality control, or health technology assessment procedures. These gaps limit the ability of developers, regulators, and clinicians to evaluate the safety and fitness of synthetic data models.
The review also identified a broader set of trends. For example, diffusion models and transformer-based generators were increasingly reported to produce higher-fidelity medical text, synthetic waveforms, and pathology images. Domain adaptation methods gained traction as practical mechanisms for recalibrating models trained outside Africa to new population groups, although empirical evidence from African settings remained scarce. Likewise, the rise of open-source platforms for synthetic data generation signals a shift toward more accessible tooling; however, these platforms have yet to be adopted effectively in African healthcare settings.
The review's results have implications for clinical governance and policy. Hybrid validation using real and synthetic data should be considered a minimum requirement for health technology assessment processes in Africa, particularly where synthetic data may obscure clinically relevant patterns. Investments in computational infrastructure, digital health records, and workforce capacity are necessary to support safe integration of synthetic data pipelines into routine care. Regulatory frameworks should also incorporate explicit standards for privacy evaluation, quality assurance, and dataset documentation. Without these foundations, synthetic data risks reinforcing existing inequities rather than mitigating data scarcity.
Priority areas for future research are threefold. First, empirical studies must evaluate synthetic data across diverse African populations to quantify fidelity at the level of rare diseases, comorbidities, and regional variations. Second, the field requires scalable, low-cost tools for context-aware domain adaptation, allowing models pretrained elsewhere to be safely recalibrated using minimal local data. Third, governance scholars and health system actors should co-design regulatory mechanisms that integrate technical benchmarks with ethical, legal, and operational considerations. Advancing these priorities will support a coherent pathway toward the safe and equitable adoption of synthetic data in African healthcare AI ecosystems.
Conclusion
Synthetic data has been gaining attention as a potential solution to the problem of data scarcity, particularly in resource-constrained settings. However, its usefulness is limited by issues such as data fidelity, model generalizability, and domain adaptation. While there has been an increased focus on generating synthetic data, the performance gap between synthetic-only models and hybrid models persists. Limited infrastructure and institutional fragmentation sustain adoption barriers to synthetic-data-based AI systems in African healthcare contexts. Without investment in digital infrastructure, interdisciplinary collaboration, and ethical oversight, synthetic data could become yet another unrealized innovation. Contextualized approaches, such as hybrid frameworks that combine synthetic and real data, complemented by robust validation procedures, appear to be a more viable way forward. We recommend that policymakers: (1) mandate hybrid synthetic–real data validation in health technology assessment (HTA) processes; (2) establish national and regional regulatory standards for synthetic data in healthcare AI; (3) invest in infrastructure and capacity to enable fidelity benchmarking and domain adaptation; and (4) embed synthetic data strategies into health information systems to ensure interoperability, transparency, and trust. These steps will support the integration of equitable, effective, and sustainable AI into African healthcare.
