Sage Journals: Discover world-class research

Abstract

This article proposes three methods of integrating machine learning (ML) and artificial intelligence (AI) techniques into qualitative data analysis procedures. Data science has revolutionized sectors such as medicine, business, and psychology. The integration of ML and AI into research has led to the development of complex models, improving research understanding. While quantitative research has embraced ML and AI, their application in qualitative research remains underexplored. However, these techniques offer faster coding for thematic analysis than human analysis, though challenges arise in resolving conflicting codes. Despite this, ML and AI have the potential to enhance the depth of findings and offer triangulation in text data analysis. They should be viewed as tools to assist qualitative researchers rather than replacements for human analysis. Various integration approaches, such as natural language processing and artificial neural networks, can be employed at different stages of qualitative research, ultimately improving trustworthiness and relevance, especially in time-sensitive scenarios like public health emergencies.

Keywords

machine learning, artificial intelligence, qualitative research, hybrid model

Technical Note

Over the past decade, data science methods have triggered a radical shift across various sectors, including but not limited to medicine and health care, business and finance, psychology and neuroscience, and physics and biology (Rahul et al., 2023; Sarker, 2021). In particular, the integration of machine learning (ML) and artificial intelligence (AI) techniques into both basic and applied research has significantly broadened the transdisciplinary scope of data science. This integration has ushered in novel, multidimensional complexity models that combine machine and human insights to better understand the underlying research problems (Sarker, 2021). In quantitative research, ML- and AI-powered tools have helped improve the quality of evidence generation and facilitate more robust inference from findings. This is evident in the surge of quantitative studies leveraging ML and AI techniques in recent years (Morande, 2022; Pena-Guerrero et al., 2021; Vinod & Prabaharan, 2020; Waller & Fawcett, 2013).

Conversely, the application of ML and AI techniques in qualitative research remains underexplored and largely untapped (Longo, 2020). Nonetheless, these techniques can be very helpful for qualitative researchers. They have demonstrated significantly faster qualitative coding for thematic analysis than human analysis, which is inherently slower and susceptible to bias (Towler et al., 2023). Various approaches for integrating ML and AI techniques into qualitative thematic analysis are possible, including unsupervised machine learning techniques such as natural language processing, text mining, and apriori association rules, as well as supervised machine learning techniques such as artificial neural networks and transfer learning. The critical consideration lies in determining the appropriate juncture for integration within the qualitative data analysis framework. Insights from existing mixed methods study designs can guide this integration. The integration can occur before (exploratory sequential integration), after (explanatory sequential integration), or during (convergent parallel integration) qualitative analysis, yielding hybrid qualitative-machine learning insights. We suggest visualizing such integration as portrayed in Figure 1.

Figure 1.

Research Designs Integrating Qualitative and Machine Learning Techniques. ML = Machine Learning; DL = Deep Learning; AI = Artificial Intelligence.

The primary challenge arising from this integration is how to handle ML-generated qualitative codes that are ambiguous or that conflict with human-generated codes. While methodological studies to guide researchers in this regard are limited, potential solutions may involve one or more external examiners who resolve conflicting codes, although debate persists over whether these examiners should be human or machine. Notwithstanding these challenges, ML and AI techniques have the potential to enhance the rigour and depth of findings by offering triangulation of text data analysis and generating more meaningful insights into the study phenomenon (Chen et al., 2018). It is crucial to understand that ML and AI are not a replacement for human analysts, especially in qualitative research, but tools to assist qualitative researchers (Christou, 2023).

Example Case Study

Title

Understanding Public Sentiment and Behavioral Drivers During a Pandemic Response (See Figure 2 for Summary Workflow)

Figure 2.

Workflow Diagram Illustrating the Integration of Machine Learning Techniques at Pre-analysis, During-Analysis, and Post-analysis Stages in a Qualitative Study of Public Sentiment and Behavior During a Pandemic Response.

Context

A team of public health researchers aims to understand how the public perceives and responds to vaccination campaigns during a pandemic. They collect thousands of open-ended survey responses, social media comments, and transcribed interviews from a diverse population.

Research Objective

To identify prevailing themes, emotional drivers, and barriers influencing vaccination uptake, with the goal of informing policy and communication strategies.

Integration of Both Qualitative and Machine Learning Techniques

Exploratory Sequential Integration

Researchers could use unsupervised ML techniques, such as natural language processing (NLP) and topic modeling (e.g., latent Dirichlet allocation, LDA), before formal human coding. The research team runs unsupervised topic modeling to cluster the data into latent themes (e.g., trust, fear, misinformation, access).
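As a rough illustration of the topic-modeling step, the sketch below fits a small LDA model by collapsed Gibbs sampling using only the Python standard library. The function name `fit_lda`, the stopword list, the hyperparameter values, and the four toy responses are all assumptions made for illustration and do not come from the study; a real project would more likely rely on an established topic-modeling library and a far larger corpus.

```python
import random
from collections import Counter

def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA on tokenized documents with collapsed Gibbs sampling."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]         # topic counts per document
    topic_word = [Counter() for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics                       # token counts per topic
    assign = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(z_d)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    # Smoothed estimate of each document's topic proportions.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    top_words = [
        [w for w, c in tw.most_common(3) if c > 0] for tw in topic_word
    ]
    return theta, top_words

# Toy responses and stopwords, invented purely for illustration.
STOPWORDS = {"i", "the", "is", "and", "in", "do", "my", "has", "too"}
responses = [
    "i do not trust the vaccine makers",
    "trust in government is low",
    "the clinic is too far and has no appointments",
    "no appointments available near my home clinic",
]
docs = [[w for w in r.split() if w not in STOPWORDS] for r in responses]
theta, top_words = fit_lda(docs, n_topics=2)
for k, words in enumerate(top_words):
    print(f"topic {k}: {words}")
```

The top words per topic are only candidate labels. In an exploratory sequential design, the researchers would still read the underlying responses before treating a cluster as a theme.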

Outcome

At this stage, the analysis reveals unexpected clusters, such as conspiracy-related discourse and confusion about eligibility criteria, prompting the researchers to refine their qualitative inquiry framework and interview questions.

Convergent Parallel Integration

Researchers could use supervised ML techniques, such as transfer learning with BERT, alongside manual coding. A subset of the data (e.g., 500 manually coded responses) is used to train a supervised classifier. The BERT-based model is then applied to the rest of the dataset to assist coding.
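The train-then-assist workflow can be sketched without a deep learning stack. The example below deliberately swaps BERT for a much simpler multinomial Naive Bayes classifier so that it runs on the Python standard library alone; it is a stand-in for the workflow, not the model described above. The function names, the code labels, the toy responses, and the `REVIEW_THRESHOLD` value are assumptions for illustration.

```python
import math
from collections import Counter, defaultdict

def train_nb(texts, labels):
    """Train a multinomial Naive Bayes model on manually coded responses."""
    word_counts = defaultdict(Counter)
    label_counts = Counter(labels)
    for text, label in zip(texts, labels):
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab

def predict_nb(model, text):
    """Return (predicted code, probability) for one uncoded response."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    log_scores = {}
    for label, n in label_counts.items():
        denom = sum(word_counts[label].values()) + len(vocab)
        score = math.log(n / total)  # log prior of the code
        for w in text.lower().split():
            if w in vocab:  # words never seen in training are ignored
                # Laplace-smoothed word likelihood
                score += math.log((word_counts[label][w] + 1) / denom)
        log_scores[label] = score
    # Convert log scores into normalized probabilities.
    top = max(log_scores.values())
    exp_scores = {l: math.exp(s - top) for l, s in log_scores.items()}
    best = max(exp_scores, key=exp_scores.get)
    return best, exp_scores[best] / sum(exp_scores.values())

# Hypothetical manually coded subset (a real one would be far larger).
coded_texts = [
    "i do not trust the vaccine",
    "the vaccine was rushed and i do not trust it",
    "the clinic is too far away",
    "no appointments at the clinic",
]
coded_labels = ["mistrust", "mistrust", "access", "access"]
model = train_nb(coded_texts, coded_labels)

REVIEW_THRESHOLD = 0.8  # assumed cut-off below which a human codes the response
code, prob = predict_nb(model, "i cannot get to the clinic")
needs_review = prob < REVIEW_THRESHOLD
print(code, round(prob, 3), "needs human review" if needs_review else "accepted")
```

Returning a probability along with the code lets low-confidence predictions be routed back to a human coder instead of being accepted automatically.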

Conflict Resolution Strategy

When model-generated codes conflict with human codes, an external examiner (e.g., a senior qualitative researcher) reviews and adjudicates the discrepancies. Conflicts that reflect nuanced meaning (e.g., sarcasm or irony) are flagged for deeper interpretation.

Outcome

This hybrid process accelerates analysis while maintaining interpretive depth and ethical sensitivity. Codes with high agreement between human and machine are prioritized for rapid synthesis; ambiguous segments receive more nuanced review.
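One way to operationalize "agreement between human and machine" is a chance-corrected agreement statistic plus a list of the segments where the two codings differ. The sketch below computes Cohen's kappa and flags disagreements for the external examiner. The choice of Cohen's kappa, the function name, and the toy codes are assumptions for illustration, not a procedure prescribed by the article.

```python
from collections import Counter

def cohen_kappa(human, machine):
    """Cohen's kappa for two equal-length lists of categorical codes."""
    n = len(human)
    observed = sum(h == m for h, m in zip(human, machine)) / n
    h_freq, m_freq = Counter(human), Counter(machine)
    # Agreement expected by chance from each coder's marginal frequencies.
    expected = sum(h_freq[c] * m_freq[c] for c in h_freq) / (n * n)
    if expected == 1:  # both coders used one identical code throughout
        return 1.0
    return (observed - expected) / (1 - expected)

# Toy codes for six text segments, invented for illustration.
human = ["trust", "fear", "access", "trust", "fear", "access"]
machine = ["trust", "fear", "access", "fear", "fear", "trust"]

kappa = cohen_kappa(human, machine)
# Segment indices where the codes conflict and adjudication is needed.
flagged = [i for i, (h, m) in enumerate(zip(human, machine)) if h != m]
print(f"kappa = {kappa:.2f}, segments for adjudication: {flagged}")
```

Segments that are not flagged can go to rapid synthesis, while the flagged ones are passed to the examiner for closer interpretation.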

Explanatory Sequential Integration

Researchers could use techniques such as neural networks and pattern recognition. After the main thematic analysis is complete, a deep learning model is used to explore patterns in the coded data, for example across demographic groups.

Outcome

The model identifies that younger respondents are more likely to express mistrust, while older respondents discuss logistical challenges. These insights lead to segmentation of recommendations by demographic group and inform targeted communication strategies for each group.
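Before reaching for a deep learning model, the same kind of question can be checked with a plain cross-tabulation of themes by demographic group. The sketch below is that simpler stand-in, not the deep learning approach described above. The function name, the age bands, and the records are placeholders invented for illustration and are not study data.

```python
from collections import Counter, defaultdict

def theme_shares(records):
    """Share of each theme within each group; records are (group, theme) pairs."""
    counts = defaultdict(Counter)
    for group, theme in records:
        counts[group][theme] += 1
    return {
        group: {t: c / sum(themes.values()) for t, c in themes.items()}
        for group, themes in counts.items()
    }

# Placeholder coded segments, invented for illustration only.
records = [
    ("18-29", "mistrust"), ("18-29", "mistrust"), ("18-29", "logistics"),
    ("60+", "logistics"), ("60+", "logistics"), ("60+", "mistrust"),
]
shares = theme_shares(records)
for group, themes in shares.items():
    print(group, {t: round(s, 2) for t, s in themes.items()})
```

Shares like these only describe the coded sample. Whether a difference between groups is meaningful still depends on the sampling design and on the researchers' interpretation.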

ML and AI techniques serve as useful tools that help qualitative researchers analyse large text datasets quickly and interpret findings with enhanced trustworthiness, depth, and multidimensional perspectives, particularly by examining the agreement between human-generated and machine-generated codes and themes (Chen et al., 2018; Towler et al., 2023). These techniques can highlight complex patterns and connections in data that human researchers may miss, which may lead to groundbreaking insights and a deeper understanding of the research problem (Badrulhisham et al., 2024). This approach holds particular promise for supporting and enhancing existing grounded theories, conceptual models, and hypotheses, especially during time-sensitive scenarios such as public health emergencies like the COVID-19 pandemic (Chen et al., 2018; Towler et al., 2023). Ultimately, the integration of these techniques stands to elevate the level of evidence derived from qualitative studies, thereby enhancing their impact and relevance in informing decision-making and policy-making.

Footnotes

ORCID iD

Hanif Abdul Rahman

Author Contributions

All authors contributed to the conception or design of the paper. All authors contributed to the interpretation and to drafting/editing the manuscript. All authors were involved in revising the manuscript, providing critical comments, and agreed to be accountable for all aspects of the work and any issues related to the accuracy or integrity of any part of the work.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

Badrulhisham, Pogatzki-Zahn, Segelcke, Spisak, & Vollert (2024). Machine learning and artificial intelligence in neuroscience: A primer for researchers. Brain, Behavior, and Immunity, 115, 470–479. https://doi.org/10.1016/j.bbi.2023.11.005

Chen, N.-C., Drouhard, Kocielnik, Suh, & Aragon, C. R. (2018). Using machine learning to support qualitative coding in social science: Shifting the focus to ambiguity. ACM Transactions on Interactive Intelligent Systems (TiiS), 8(2), 1–20. https://doi.org/10.1145/3185515

Christou, P. A. (2023). The use of artificial intelligence (AI) in qualitative research for theory development.

Longo (2020). Empowering qualitative research methods in education with artificial intelligence. In A. P. Costa, L. P. Reis, & Moreira (Eds.), Computer supported qualitative research (pp. 1–21). Springer International Publishing.

Morande (2022). Enhancing psychosomatic health using artificial intelligence-based treatment protocol: A data science-driven approach. International Journal of Information Management Data Insights, 2(2), Article 100124. https://doi.org/10.1016/j.jjimei.2022.100124

Pena-Guerrero, Nguewa, P. A., & García-Sosa, A. T. (2021). Machine learning, artificial intelligence, and data science breaking into drug design and neglected diseases. Wiley Interdisciplinary Reviews: Computational Molecular Science, 11(5), Article e1513.

Rahul, Banyal, R. K., & Arora (2023). A systematic review on big data applications and scope for industrial processing and healthcare sectors. Journal of Big Data, 10(1), 133. https://doi.org/10.1186/s40537-023-00808-2

Sarker, I. H. (2021). Data science and analytics: An overview from data-driven smart computing, decision-making and applications perspective. SN Computer Science, 2(5), 377. https://doi.org/10.1007/s42979-021-00765-8

Towler, Bondaronek, Papakonstantinou, Amlôt, Chadborn, Ainsworth, & Yardley (2023). Applying machine-learning to rapidly analyze large qualitative text datasets to inform the COVID-19 pandemic response: Comparing human and machine-assisted topic analysis techniques. Frontiers in Public Health, 11, Article 1268223. https://doi.org/10.3389/fpubh.2023.1268223

Vinod, D. N., & Prabaharan, S. R. S. (2020). Data science and the role of Artificial Intelligence in achieving the fast diagnosis of Covid-19. Chaos, Solitons & Fractals, 140, Article 110182. https://doi.org/10.1016/j.chaos.2020.110182

Waller, M. A., & Fawcett, S. E. (2013). Data science, predictive analytics, and big data: A revolution that will transform supply chain design and management. Journal of Business Logistics, 34(2), 77–84. https://doi.org/10.1111/jbl.12010

Integration of Machine Learning and Artificial Intelligence Techniques for Qualitative Research: The Rise of New Research Paradigms
