Abstract
Keywords
Introduction
Thematic analysis has emerged as a foundational method within qualitative research and is widely employed for identifying, analysing, and interpreting patterns of meaning within data (Braun & Clarke, 2006; Lindberg et al., 2024; Sandelowski, 2004). It offers a flexible yet systematic framework for coding and theme development, which enables researchers to produce rich, detailed, and nuanced accounts of complex datasets (Braun & Clarke, 2016; Chan, 2025; Guest et al., 2011; Pope et al., 2000). This interpretive process facilitates an in-depth exploration of participants’ lived experiences and perceptions, or, in the case of textual sources, embedded discourses (Boyatzis, 1998; Holloway & Todres, 2003; Kelleher & Murphy, 2024). More recently, the application of thematic analysis to secondary research, such as previously collected qualitative data, documents, or literature, has gained prominence; this approach allows scholars to revisit and reinterpret existing material through new theoretical or contextual lenses (Ozuem et al., 2022). The increased interest in secondary data analysis reflects broader trends within qualitative enquiry, particularly in response to rapid technological developments. Secondary data is information that has been collected previously by other researchers, institutions, or organisations for purposes distinct from the current research project (Saunders et al., 2019). Common sources include publicly available datasets, government records, organisational documents, historical archives, social media outputs, and previously published research findings. In qualitative research, secondary data can take diverse forms, such as interview transcripts, ethnographic fieldnotes, diaries, blogs, and multimedia records (Saunders et al., 2019). The recent emergence of generative artificial intelligence (AI) technologies, such as ChatGPT, Gemini, DeepSeek, and Copilot, has expanded the landscape of secondary data.
AI-generated content, including conversational transcripts, technical documentation, and creative outputs, now constitutes an increasingly important source for qualitative analysis (Panke, 2025). This development offers researchers new opportunities to investigate contemporary digital phenomena, extend existing theoretical frameworks, and explore the evolving dynamics of human–machine interactions (Crompton & Burke, 2023). The growing diversity of secondary data sources thus not only facilitates the investigation of novel research questions and the extension of existing findings but also enables longitudinal and comparative analyses without the necessity of recollecting similar information.
Extensive research has documented that secondary data have long been recognised as an important resource; they offer researchers access to extensive and rich datasets without the logistical and financial demands associated with primary data collection (Maxwell, 2013; Saunders et al., 2019). In an increasingly data-driven world, secondary data have become even more central to research practices across a range of disciplines, particularly as vast volumes of information are generated through digital platforms, archives, and administrative records (Brodnik et al., 2023). However, despite its growing prominence, the use of secondary data presents both significant opportunities and substantial methodological and ethical challenges, especially when approached through the lens of thematic analysis. These challenges are particularly salient because thematic analysis, by its nature, is deeply interpretive and context-sensitive. When applied to secondary data, thematic analysis requires careful attention to the circumstances in which the original data were produced, as this context significantly shapes the meanings and patterns that can be legitimately constructed (Braun & Clarke, 2006). Researchers should therefore exercise reflexivity, recognising that their thematic interpretations are influenced both by the original intent behind the data and by their own theoretical positioning. The richness and diversity of secondary datasets, particularly those derived from digital platforms, provide fertile ground for thematic exploration and offer possibilities for the examination of contemporary social phenomena through an interpretive lens. This potential should be balanced against the risk of decontextualisation and the ethical imperative to respect the intentions, privacy, and dignity of the original participants (Naeem et al., 2024).
To the best of our knowledge, there is currently no published research that explicitly examines the influence of generative AI on the sensemaking processes associated with secondary data analysis. While scholars across various disciplines have contributed to a modest and emerging body of literature on thematic analysis, much of this research remains limited in scope, particularly in relation to technology-mediated contexts. Existing contributions have predominantly focused on traditional applications of thematic analysis, often overlooking the epistemological and methodological implications of integrating AI into qualitative research practices (Brodnik et al., 2023; Naeem et al., 2024). A significant gap persists in understanding how generative AI technologies affect the interpretative work of researchers engaging with secondary data. Thematic analysis, as a method grounded in the identification and interpretation of patterns of meaning, is inherently shaped by the data available and the tools used to analyse the data. Generative AI, particularly large language models (LLMs), can curate, synthesise, and reframe vast quantities of secondary information, which could potentially alter how researchers perceive thematic structures and construct meaning from the data (Lixandru, 2024). However, the implications of such interventions, both in terms of epistemic authority and analytical transparency, have yet to be fully theorised. While extant studies have paid close attention to the readily apparent and evolving dynamics of AI within qualitative research, there remains a pressing need for these transformations to be both theorised and empirically traced, particularly in relation to thematic analysis and the use of secondary data. This dimension is crucial, as generative AI enables unprecedented access to a wide array of publicly available, semi-structured, and unstructured data sources. 
Researchers can now engage with metadata, explanatory texts, and associated content that may significantly shape the interpretation of rankings or other evaluative mechanisms. The absence of scholarly attention to this aspect limits our understanding of how data presentation influences sensemaking and thematic extraction in AI-supported analysis.
This insight is of particular importance for both researchers and practitioners. As technological advances make it increasingly feasible to retrieve and analyse a gamut of information related to specific phenomena (Castillo-Segura et al., 2023; Ozuem & Willis, 2025), the capacity for thematic analysis to produce credible and contextually sensitive insights depends on how meaning is constructed in relation to both the content and the structure of secondary data. Generative AI does not merely expand access to data – it also mediates how researchers engage with and make sense of those data (Dwivedi et al., 2023). As such, the need for methodologically reflective approaches becomes even more pronounced. Without critical engagement with the nature of the data and the tools used to analyse the data, there is a risk of generating findings that are overly reliant on algorithmic patterns and insufficiently grounded in human interpretation. This paper, therefore, seeks to contribute to the emerging discourse by situating generative AI within the context of thematic analysis across both primary and secondary data; it highlights the need for further research into how AI tools shape the researcher’s role in constructing knowledge, and emphasises the importance of transparency, reflexivity, and contextual awareness in technology-mediated qualitative enquiry.
Drawing on the work of Ozuem et al. (2022) on thematic analysis, we explore how thematic analysis can facilitate the analysis of primary and secondary data. We respond to calls for a more inclusive understanding of the practices surrounding thematic analysis that considers a broader range of issues and diverse research settings (Morgan, 2023; Naeem et al., 2023). Our contribution is threefold. First, we highlight a critical yet previously underexplored dimension of thematic analysis that becomes increasingly salient in the era of generative AI (Lixandru, 2024). In doing so, we offer a more refined understanding of the notion of iteration within the thematic analysis process when working with primary and secondary data. Second, we develop the RIPES (Reflexivity, Interpretation, Procedural consistency, Evaluation, and Situatedness) model to conceptualise how thematic analysis can enhance researchers’ engagement with primary and secondary data. Our study not only proposes a generalised model of thematic analysis within the context of primary and secondary data use but also stimulates future research by identifying the benefits, limitations, and practices that strengthen the RIPES framework. Third, we examine how thematic analysis enriches qualitative enquiry within AI-driven contexts. Our work thus contributes to a growing body of scholarship examining how thematic analysis can support primary and secondary data analysis, and offers a critical counterpoint to more optimistic portrayals of the burgeoning availability of secondary data.
Conceptual Background
Summary of Research, Context, and Findings on Thematic Data Analysis
Of particular relevance to the present study is an exploration of the concepts, antecedents, and consequences associated with the emergence of generative AI technologies in relation to thematic analysis. Generative AI, particularly models such as OpenAI’s GPT series and other LLMs, is increasingly influencing how qualitative researchers approach data collection, analysis, and interpretation (Bartelheimer et al., 2023). Conceptually, generative AI refers to machine learning systems capable of producing coherent, contextually relevant text, images, or other outputs based on training on large datasets (Dwivedi et al., 2023). In the context of thematic analysis, these technologies introduce a novel intermediary between researchers and their data, which raises critical questions about authenticity, interpretation, and epistemic authority. The antecedents of generative AI’s emergence in qualitative research are manifold. Several studies have indicated that advances in computational linguistics, natural language processing, and machine learning have enabled the development of sophisticated models capable of producing human-like text (Dwivedi et al., 2023; Panke, 2025). Other studies have suggested that the proliferation of digital communication platforms and the corresponding exponential increase in available secondary data have created fertile conditions for the integration of AI in research methodologies (Engert et al., 2023; Pareschi, 2023). Researchers are now able to access vast repositories of text-based data, such as social media posts, blogs, and online forums, which facilitate the use of secondary data in qualitative studies.
However, while generative AI steadily refines and expands secondary databases, it does not fundamentally increase the quantity of original qualitative data. Instead, it enhances the accessibility and organisation of existing data, often producing synthesised or rearticulated outputs based on the patterns it identifies within its training corpus (Brodnik et al., 2023). This distinction is crucial, as it underscores the fact that the primary contribution of generative AI lies in its capacity to mediate researchers’ engagement with data rather than in expanding the empirical world itself (Sharples, 2023). Despite these opportunities, generative AI also presents significant challenges for qualitative researchers. One key concern is the risk of overreliance on AI-generated outputs, which may obscure the interpretative and reflexive processes that are fundamental to thematic analysis (Pareschi, 2023). The use of AI tools can inadvertently introduce biases embedded within training datasets, potentially reproducing systemic inequities or overlooking nuanced, context-specific meanings (Glickman & Zhang, 2024; Wei et al., 2022). Moreover, questions of data authenticity arise when researchers analyse AI-generated summaries or thematic structures rather than engage directly with participants’ original narratives.
Consequently, the integration of generative AI into thematic analysis necessitates a critical and reflective stance. Researchers need to be more vigilant about maintaining the integrity of qualitative enquiry when using AI, and to ensure that AI serves as an assistive tool rather than as a substitute for human interpretation (Schmitt, 2024). This involves interrogating the provenance of data, acknowledging the potential for algorithmic bias, and foregrounding the researcher’s interpretative agency. In doing so, scholars can harness the advantages of generative AI, such as efficiency in handling large datasets and identifying preliminary patterns, while safeguarding the epistemological foundations of thematic analysis (Si et al., 2024). The emergence of generative AI technologies presents both opportunities and challenges for thematic analysis. While these tools facilitate the refinement and expansion of secondary databases, they do not replace the critical, interpretative approach that qualitative research demands (Perkins & Roe, 2024). Understanding the conceptual foundations, antecedents, and consequences of generative AI’s role in thematic analysis is essential for navigating this evolving landscape responsibly (Paulus & Marone, 2024).
Social Constructionist Research and Generative AI in Thematic Analysis
The social constructivist philosophical paradigm offers a distinct ontological and epistemological stance within qualitative research that privileges meaning-making as an inherently social, contextual, and interpretative process (Lincoln & Denzin, 2000; Pernecky, 2016). Rather than assuming an objective, external reality awaiting discovery, constructionism asserts that knowledge and reality are co-constructed through interaction, discourse, and cultural practice (Berger & Luckmann, 1966; Burr, 2015). This paradigm has profound implications for methodological approaches, such as thematic analysis, particularly in the current moment where AI is increasingly being integrated into qualitative research practices. Ontologically, social constructionism is aligned with relativism; relativism suggests that reality is not fixed or singular but rather multiple, fluid, and shaped through human negotiation (Crotty, 1998; Guba & Lincoln, 1994; Ozuem et al., 2025). Within this worldview, social phenomena are not taken as natural or essential but as historically and culturally contingent. Accordingly, thematic analysis conducted from a constructivist position aims not to uncover “truths” embedded in the data, but to explore how meaning is constructed through language and interaction in particular contexts (Braun & Clarke, 2006; Holstein & Gubrium, 2008; Maxwell, 2013; Ravitch & Carl, 2019; Saunders et al., 2019). When AI technologies, such as LLMs, are introduced into this interpretative process, tensions arise. LLMs operate on statistical probabilities and are trained on vast datasets to detect patterns that often disregard the relational, situated, and socially constructed nature of meaning (Glickman & Zhang, 2024). From a constructivist perspective, such tools risk reifying interpretations as though they were objective or generalisable, thus flattening the nuance and plurality that constructivist thematic analysis seeks to preserve.
Epistemologically, social constructionism embraces a subjectivist or intersubjectivist stance. Knowledge is viewed as a product of shared meanings, shaped through dialogue between researcher and participant, rather than a reflection of an external reality (Crotty, 1998; Kukla, 2000; Lincoln et al., 2018; Ozuem et al., 2024; Seale, 2000). In reflexive thematic analysis, for example, themes are seen not as emerging naturally from the data, but as actively constructed by the researcher through their interpretative lens (Hole, 2024). The use of AI challenges this principle by introducing a computational intermediary that simulates interpretation without possessing reflexivity or contextual understanding. Although AI can assist in organising large volumes of text or identifying linguistic regularities, it cannot replicate the interpretative labour embedded in socially and contextually informed analysis. Consequently, it is essential to engage with AI-generated outputs critically and reflexively, while recognising the limitations of AI models in producing knowledge aligned with a constructivist epistemology (Howell, 2013).
Axiologically, the social constructivist perspective foregrounds the value-laden nature of research. It rejects the notion of value-free enquiry and instead promotes transparency regarding the researcher’s assumptions, ethical commitments, and positionality (Denzin & Lincoln, 2018). Reflexivity is central to this approach, not only as a methodological tool but as an ethical imperative (Finlay, 2002). The introduction of AI into thematic analysis raises important axiological concerns. AI systems are trained on pre-existing corpora that often reflect dominant ideologies, cultural biases, and systemic inequalities. Without critical reflexivity, there is a risk that researchers will inadvertently reproduce these biases, thereby undermining the ethical responsibility to engage meaningfully with participants’ voices and contexts (Caputo, 2018; Glickman & Zhang, 2024; Pareschi, 2023). Reliance on AI tools may obscure the interpretative agency of the researcher, thus leading to a misleading perception of neutrality or objectivity within the analytical process. Although AI offers novel affordances for managing textual data, its integration into thematic analysis requires careful and critical consideration when situated within a social constructivist paradigm (Hennrich et al., 2024; Naeem et al., 2023). The core assumptions of constructionism – its ontological relativism, epistemological subjectivism, and axiological transparency – stand in tension with the mechanistic, decontextualised logic underpinning AI technologies. For researchers adopting a constructivist orientation, AI can be used to support, but not supplant, the deeply interpretative, context-sensitive, and reflexive work that characterises rigorous qualitative analysis.
A social constructivist epistemology conceptualises knowledge not as a fixed entity transmitted between individuals, but as an active, co-constructed process shaped through language, social interaction, and historically situated practices. Poggi’s (1965, p. 284) assertion that “a way of seeing is a way of not seeing” is especially germane to contemporary methodological debates surrounding the use of generative AI in qualitative research, particularly in the context of thematic analysis. Each interpretive lens inherently privileges specific meanings while simultaneously marginalising or obscuring others. Within this epistemological framing, generative AI systems, which are frequently deployed to support or automate qualitative analytic processes, operate through distinct interpretive logics that embed implicit assumptions about meaning, coherence, and the identification of patterns (Christou, 2023; Peng et al., 2022). When employed in thematic analysis, such systems may efficiently detect lexical repetitions and semantic clusters across voluminous datasets (Pattyn, 2024; Schmitt, 2024). Yet, this computational “way of seeing” often prioritises formal regularities at the expense of interpretive nuance. Specifically, it risks neglecting the situated, affective, and culturally contingent dimensions of meaning-making that are typically foregrounded in constructivist qualitative enquiry (Naeem et al., 2023). As a result, themes generated by AI may inadvertently reproduce dominant discourses embedded within training data while eliding subaltern voices, marginal narratives, or the complex symbolic dimensions of participants’ experiences (Christou, 2023). This tendency reflects the broader limitations of algorithmic interpretation, wherein what is rendered legible by AI may be constrained by its underlying ontological and epistemological premises. 
Social constructivism, therefore, serves as a critical counterpoint that urges scholars to interrogate the methodological and interpretive assumptions that underpin AI-assisted analysis. Poggi’s (1965) aphorism functions here as a methodological provocation that reminds researchers that all acts of interpretation, whether conducted by humans or delegated to machines, are necessarily partial and exclusionary. Embracing this reflexivity creates space for more pluralistic, ethically sensitive, and critically engaged qualitative research. Rather than accepting AI outputs as definitive, scholars are encouraged to consider what remains unspoken, ambiguous, or structurally silenced within data (Hennrich et al., 2024), thereby preserving the interpretive depth and contextual sensitivity that thematic analysis demands within a constructivist paradigm.
Constructivism emphasises reflexivity; it requires researchers to recognise and critically engage with their own values, assumptions, and positionalities throughout the research process. Transparency and critical self-awareness become essential practices to ensure the trustworthiness of findings. Rather than objectivity, trustworthiness, which is operationalised through criteria such as credibility, transferability, dependability, and confirmability, serves as the primary evaluative standard for constructivist research (Guba & Lincoln, 1989). Thematic analysis has emerged as a widely accepted and adaptable method for examining meaning within qualitative research. Initially valued for its flexibility and accessibility across disciplines and data types (Braun & Clarke, 2006; Hole, 2024), thematic analysis aligns closely with constructivist assumptions when applied interpretatively and allows researchers to explore the subjective realities constructed by participants. However, critiques have highlighted that thematic analysis, when applied in an atheoretical or mechanistic manner, risks overlooking the socio-cultural and historical contexts that shape participants’ accounts (Nowell et al., 2017). Without a critical, reflexive orientation, thematic analysis may inadvertently reproduce decontextualised readings, which is antithetical to constructivist principles.
Addressing these concerns, Ozuem et al. (2022) advanced a reflexive and non-linear model of thematic analysis comprising five interrelated and iterative phases. The first phase, scoping and excavation, involves critical engagement with the conceptual, contextual, and epistemological dimensions of the data, with the aim of revealing latent and often obscured meanings. The second phase, data segmentation, refers to the inductive organisation of data into analytically salient units; this process is shaped by theoretical sensitivity and the researcher’s reflexive stance. Manifestation and categorisation, the third phase, identifies core meanings by attending to both manifest content and underlying discursive structures. The fourth phase, developing categories and themes, emphasises conceptual refinement through ongoing critical evaluation and integration with theoretical constructs. Finally, meaning-making and consolidation entails the construction of a coherent, interpretive narrative that situates findings within broader socio-cultural contexts of meaning, power, and knowledge production.
Ozuem et al.’s (2022) thematic analysis model provides a valuable foundation for exploring lived experiences, particularly within dynamic and technology-mediated contexts. Their approach emphasises the importance of flexibility, contextual sensitivity, and theoretical grounding in the development of themes; it offers researchers a structured yet adaptable framework for engaging with complex and evolving data environments. Nevertheless, the increasing reliance on secondary data within qualitative research demands a critical reassessment of existing analytical models. In an era characterised by the exponential growth of digital content creation, particularly through generative AI technologies, the nature of available secondary data has become markedly more diverse and challenging. Secondary data sources now extend beyond traditional archives and qualitative datasets to include AI-generated content produced by platforms such as ChatGPT, Gemini, DeepSeek, Copilot, and other machine learning systems (Paulus & Marone, 2024). These new forms of data present unique methodological challenges, particularly concerning the authenticity, provenance, and interpretive context of the material. Moreover, issues related to ethical use, including questions of consent, ownership, and the potential biases embedded within AI-generated outputs, further complicate the application of traditional thematic analysis models. Although Ozuem et al.’s (2022) framework provides a strong basis for primary data analysis, it does not yet fully address the complexities introduced by working with heterogeneous and technologically mediated secondary sources. Consequently, there is a compelling need to extend and adapt their model so that it incorporates additional strategies that respond to the particular demands of contemporary secondary data environments.
Doing so would not only enhance methodological rigour but also ensure that thematic analysis remains a robust and ethically responsible tool for interrogating emerging digital phenomena. Grounded in social constructivist principles, the following section introduces the RIPES model as a conceptual framework for advancing thematic analysis in the era of artificial intelligence.
A RIPES-Based Approach to Thematic Analysis
In addressing the opportunities and challenges of applying thematic analysis to secondary data, this paper proposes the RIPES model – a structured framework designed to support researchers and practitioners in maintaining methodological rigour and analytical robustness. As the use of qualitative secondary data becomes increasingly common, particularly with the proliferation of digital and administrative datasets, there is a pressing need for clear guidance on how to uphold the quality and ethical standards of thematic analysis in these contexts (Naeem et al., 2023). Each element of the RIPES model (Figure 1) – Reflexivity, Interpretation, Procedural consistency, Evaluation, and Situatedness – plays a vital role in ensuring that thematic analysis of secondary data remains credible, nuanced, and contextually sensitive. Each element is described in the following five subsections.
Figure 1. RIPES Model
Reflexivity
Reflexivity is a foundational principle of qualitative research; it refers to the practice of critically examining one’s own assumptions, positionalities, and interpretative frameworks and considering how these shape and influence the research process (Lixandru, 2024). It challenges researchers to reflect on their values, identities, and epistemological orientations, recognising that these elements influence data collection, analysis, and interpretation (Gubrium & Holstein, 1997). Within thematic analysis, reflexivity is particularly significant, as it underpins how themes are constructed, coded, and interpreted. Thematic analysis spans a spectrum of methodological approaches that range from objectivist to subjectivist paradigms (Braun & Clarke, 2006; Naeem et al., 2024). Objectivist approaches assume that themes exist independently within the data and can be systematically uncovered. Subjectivist or reflexive approaches, however, argue that themes are actively constructed by the researcher through interpretative engagement. Reflexivity is essential to navigating this dichotomy, as it allows the researcher to acknowledge their role in shaping meaning and to maintain transparency about their analytical choices.
Traditionally, reflexivity has marked a key distinction between qualitative and quantitative research methodologies. While quantitative approaches aim to eliminate researcher influence through standardisation, qualitative enquiry recognises the researcher as an integral part of the knowledge production process (Howell, 2013; Lincoln & Guba, 1985; Tobin & Begley, 2004). Reflexivity enhances the trustworthiness and rigour of qualitative research by encouraging a continuous, critical self-awareness across all stages of the research process. However, the rise of generative AI, particularly LLMs such as OpenAI’s GPT series, has introduced new complexities to reflexive practice (Chiu, 2023). These tools can generate codes, summarise datasets, and even suggest thematic structures (Dwivedi et al., 2023), thereby offering both opportunities and challenges for qualitative researchers. Although AI tools may enhance the accessibility and organisation of large volumes of data, they risk distancing researchers from the interpretative depth and contextual sensitivity traditionally valued in thematic analysis. This shift necessitates a broadened form of reflexivity – one that interrogates not only the researcher’s subjectivity but also the assumptions embedded within AI technologies. As Castillo-Segura et al. (2023) explained, it is important for scholars to critically reflect not only on their own perspectives but also on how AI-generated outputs are shaped by the data on which the AI systems were trained, the embedded biases and dominant cultural narratives (Pattyn, 2024), and the assumptions underlying their design. If these systems are used uncritically, there is a risk that AI-generated outputs may overshadow the nuanced insights that emerge from direct, reflexive engagement with participants’ narratives (Schmitt, 2024).
In this context, reflexivity also encompasses the ethical and epistemological implications of using AI in qualitative research. A critical consideration arises regarding the locus of interpretative authority – whether it lies with the researcher or the algorithm – and the extent to which methodological integrity is sustained when qualitative enquiry is mediated by generative technologies (Christou, 2023). Rather than replacing human interpretation, AI should be seen as a tool that requires critical oversight and reflexive scrutiny. Reflexivity remains indispensable in thematic analysis, particularly as researchers engage with the affordances and limitations of generative AI.
Researchers should be vigilant in recognising that generative AI tools are not neutral conveyors of meaning; rather, they are active participants in the interpretative process, and are influenced by contemporary concerns, dominant disciplinary narratives, and the socio-political contexts in which they were developed. This necessitates a double layer of reflexivity: first, towards one’s own theoretical and personal biases; and second, towards the technological mediation that generative AI introduces into the analytic encounter. A researcher’s reflexivity is essential for maintaining the credibility and trustworthiness of thematic interpretations derived from AI-assisted analyses. It guards against the risk of perpetuating algorithmic biases, the imposition of presentist perspectives onto historical data, or the uncritical acceptance of AI-generated thematic suggestions. In particular, when analysing archived or historically situated datasets, there is a heightened risk that generative AI, trained predominantly on contemporary data, might obscure or distort the socio-cultural specificity of the original material. Therefore, embedding reflexivity as a central, ongoing practice in AI-mediated secondary data analysis not only strengthens methodological rigour but also upholds ethical standards. It ensures that researchers remain critically engaged with both the data and the tools they employ; thus, it fosters richer, more responsible interpretations that respect the provenance, complexity, and cultural specificity of the data under investigation.
Interpretation
Interpretation lies at the heart of qualitative enquiry, which is grounded in the commitment to understand social phenomena from the standpoint of those experiencing them. As Denzin and Lincoln (2005) argued, qualitative studies aim to capture meaning through the lens of human experience, and they pay particular attention to the situated, contextual, and relational nature of meaning-making. This interpretative orientation requires engagement with the intersubjective dimensions of the social world – those shared understandings and negotiated meanings that emerge between individuals and within communities. However, such an undertaking inherently raises methodological tensions. Specifically, researchers are tasked with addressing the dual imperative of producing transparent, rigorous analysis while acknowledging the inevitably subjective nature of their interpretative role.
Thematic analysis exemplifies these interpretative challenges. Whether situated within realist or constructivist paradigms, thematic analysis entails the identification and articulation of patterns of meaning across a dataset (Braun & Clarke, 2006, 2019). However, the act of theme development is not a neutral exercise. Rather, it is shaped by the researcher’s theoretical positioning, prior assumptions, and interpretative lens. Reflexivity thus becomes central to the practice of thematic analysis to ensure that the interpretative process is made visible and subject to critical scrutiny. Although realist approaches to thematic analysis may suggest that themes can be “discovered” within the data, more reflexive orientations emphasise that themes are actively “constructed” through an iterative and interpretative process. In both cases, interpretation is an unavoidable and defining feature.
The emergence of generative AI technologies introduces a novel layer of complexity to interpretative practices in qualitative research. LLMs are increasingly deployed to assist with data coding, pattern recognition, and even thematic structuring. While these tools offer considerable efficiencies – particularly when working with large-scale, text-rich datasets – they risk shifting the locus of interpretation away from the human researcher. By mediating the researcher’s engagement with data, these technologies raise significant questions concerning epistemic authority and methodological integrity (Christou, 2023; Pattyn, 2024; Schmitt, 2024).
AI tools do not “interpret” in the human sense. Instead, they generate outputs based on statistical associations learned from vast corpora of training data. Consequently, when used in the context of thematic analysis, generative AI may produce outputs that appear thematically coherent but lack contextual sensitivity or critical depth. This challenges researchers to consider not only what is produced by AI, but how and why those outputs are generated – and with what implications for qualitative interpretation. The risk of decontextualised, surface-level analysis is particularly salient when researchers rely on AI-generated summaries rather than engaging directly with participants’ narratives. While AI may augment certain aspects of analysis, it cannot replace the reflexive, situated judgement that defines rigorous qualitative interpretation. The task, then, is not to reject AI outright but to position it within a critically reflexive framework that foregrounds human interpretative agency and the contextual richness of qualitative data (Christou, 2023; Roberts et al., 2024).
Conducting robust thematic analysis depends on the careful and critical interpretation of both primary and secondary data sources. Researchers are encouraged to move beyond surface-level readings and engage deeply with the underlying meanings, power relations, and silences embedded within the data (Corti & Thompson, 2004). Interpretation, in this sense, involves more than identifying overt patterns; it requires an interrogation of the social, historical, and political contexts that shaped the production of the data, as well as an attentiveness to whose voices are represented and whose may be marginalised or omitted. In particular, when working with secondary data, sensitivity to the original aims, contexts, and conditions under which the data were generated is essential. Researchers are tasked with honouring the integrity of the original material while remaining open to the generation of new insights through the application of contemporary theoretical frameworks. This dual responsibility demands an interpretative approach that is both respectful and innovative, and that recognises the evolving nature of meaning over time (Miles & Huberman, 1994).
Procedural Consistency
Maintaining procedural consistency is essential for producing trustworthy and credible outcomes in thematic analysis. A methodologically sound thematic analysis requires researchers to follow systematic and transparent processes that encompass data familiarisation, initial coding, theme development, and the coherent reporting of findings (Braun & Clarke, 2006). Whether adopting an inductive approach, where themes are grounded in the data itself, or a deductive one, which is driven by established theoretical frameworks, the internal coherence of the analytic procedure plays a pivotal role in establishing the dependability of the research. Procedural consistency involves more than simply following a sequence of steps; it requires a critical awareness of how each stage of the analytic process is enacted and justified. For instance, researchers benefit from clearly articulating why they chose specific coding strategies, how themes were refined over time, and how decisions were made when encountering contradictory, ambiguous, or marginal data (Ravitch & Carl, 2019). Such transparency contributes to the trustworthiness of the analysis by allowing others to assess the logic and rigour behind key analytic choices.
In the context of secondary data analysis, particularly when enhanced or mediated by generative AI, procedural consistency becomes even more significant. AI tools may accelerate certain stages of the analytic process, such as coding or clustering, but without rigorous documentation and critical oversight, their use may obscure how interpretative decisions were made. Researchers working in AI-augmented contexts are therefore advised to document both human-led and AI-supported analytic actions in detail to ensure clarity in how codes and themes were identified, revised, or validated. This level of transparency enables the evaluation of the methodological integrity of the study and supports its reproducibility. Moreover, procedural consistency is vital for ensuring that thematic analysis contributes meaningfully to qualitative knowledge. Inconsistent or poorly reported procedures risk undermining the credibility of findings and diminishing the value of secondary data, which already present challenges related to context, authenticity, and researcher detachment (Robson, 2002). By contrast, a well-documented, internally coherent analytic process enhances both the epistemic and practical value of the research, and allows findings to be situated meaningfully within wider scholarly debates and applied contexts (Flick, 2023; Ozuem et al., 2022).
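One way such documentation might be kept in practice is as a simple audit trail that records, for each analytic action, the stage, whether it was human-led or AI-supported, and the rationale behind it. The following Python sketch is purely illustrative and is not part of the RIPES model as published; the class names, field names, and the tool label are hypothetical and would need to be adapted to a given study.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass
class AnalyticAction:
    """One documented step in the analysis (all field names are illustrative)."""
    stage: str      # e.g. "initial coding", "theme development"
    actor: str      # "human" or "ai"
    action: str     # what was done
    rationale: str  # why it was done
    tool: str = ""  # hypothetical label for the AI tool, if any
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


class AuditTrail:
    """Ordered record of human-led and AI-supported analytic actions."""

    def __init__(self):
        self.entries = []

    def record(self, stage, actor, action, rationale, tool=""):
        # Restrict the actor label so the two sources stay distinguishable.
        if actor not in ("human", "ai"):
            raise ValueError("actor must be 'human' or 'ai'")
        entry = AnalyticAction(stage, actor, action, rationale, tool)
        self.entries.append(entry)
        return entry

    def by_actor(self, actor):
        # Return only the actions attributed to the given actor.
        return [e for e in self.entries if e.actor == actor]

    def to_json(self):
        # Serialise the full trail, e.g. for an appendix or data repository.
        return json.dumps([asdict(e) for e in self.entries], indent=2)
```

A researcher could, for example, call `record("initial coding", "ai", ...)` when accepting or rejecting an AI-suggested code, and later use `by_actor("ai")` to review every point at which AI output entered the analysis. Keeping the rationale alongside each entry is the design choice that matters here, since it is the justification, and not merely the sequence of steps, that allows others to assess the analysis.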
Evaluation
Evaluation within the context of thematic analysis entails a sustained and critical appraisal of both the secondary data sources and the researcher’s own analytic practices. It is not a singular phase of research, but rather a continual process that begins prior to analysis and extends throughout the interpretative journey. As Patton (2001) suggested, researchers are expected to assess the relevance, quality, completeness, and trustworthiness of the secondary datasets with which they engage. Given the variability in how such data are collected, curated, and preserved, this evaluative process becomes crucial in determining whether the data are fit for the intended analytic purpose. Secondary datasets, particularly those not originally designed for the current research questions, may differ considerably in their richness, scope, and reliability. In some cases, documentation may be incomplete, or critical contextual details, such as the socio-political setting, the identity of the original data collectors, or the conditions under which participants responded, may be unavailable (Patton, 1990). Where such limitations are present, researchers are advised to exercise discernment in selecting data and to be transparent about the implications of these constraints for interpretation. Critical judgement is essential to avoid overextending conclusions or misrepresenting the nature of the original data.
Equally important is the evaluation of the researcher’s own analytic process. This involves reflecting critically on how themes are developed, revised, and justified, and ensuring that these remain closely tied to the empirical material. Ongoing self-evaluation helps guard against interpretative drift, where themes may become increasingly speculative or detached from the data, and reinforces analytical rigour. As Marshall and Rossman (1999) argued, such reflective scrutiny of one’s methods enhances both the credibility and the methodological soundness of qualitative enquiry. The integration of generative AI in secondary data analysis further heightens the need for rigorous evaluation. While AI tools can assist in managing large volumes of text and proposing thematic groupings, their outputs should not be accepted uncritically. Researchers are encouraged to evaluate how AI-generated patterns align with the original data and to interrogate any discrepancies or oversights. This involves not only assessing the quality of the underlying data but also questioning how the algorithms may shape, filter, or distort interpretative outcomes. Ultimately, robust evaluation enhances the methodological integrity of thematic analysis and supports the production of credible, trustworthy findings. By critically appraising both the dataset and the analytic process – whether human-led, AI-assisted, or both – researchers help ensure that their contributions to qualitative knowledge are both empirically grounded and ethically responsible.
Guba and Lincoln’s (1989) framework for assessing trustworthiness in qualitative research provides a coherent evaluative lens that aligns with the epistemological assumptions of the social constructivist paradigm. Within this orientation, knowledge is understood as co-constructed, contextually situated, and shaped by the interaction between researcher and the phenomena under investigation. In thematic analysis, credibility involves sustained reflexive engagement with the data to ensure that emergent themes are meaningfully grounded in participants’ accounts. Transferability is supported through thick, context-rich description that allows readers to make informed judgements about the relevance of findings to other settings. The integration of generative AI into thematic analysis introduces new complexities to this evaluative framework. While AI can enhance efficiency by identifying surface-level patterns in large textual corpora, its involvement necessitates heightened reflexivity. Ensuring dependability entails clearly documenting analytic procedures and interpretive decisions, particularly in relation to the influence of AI-generated suggestions. Confirmability involves a critical appraisal of both human and computational contributions to the analytic process, with careful attention to potential distortions introduced by algorithmic outputs. Attending to these criteria enables researchers to enhance the methodological rigour of constructivist enquiry, while maintaining sensitivity to the interpretive, relational, and context-dependent nature of qualitative knowledge production (Naeem et al., 2024).
Situatedness
Attention to situatedness is fundamental when applying thematic analysis to secondary data. Situatedness refers to an awareness of, and engagement with, the socio-cultural, historical, institutional, and relational contexts in which data were originally produced (Denzin & Lincoln, 2000; Kukla, 2000). Recognising these contextual dimensions is vital for ensuring that the meanings derived from the data remain faithful to the environments and conditions from which they emerged. Unlike primary data, where researchers may directly shape and observe the context of data collection, secondary data are often removed from their original settings, thereby increasing the risk of interpretative detachment. Neglecting the situated nature of secondary data can result in decontextualised or distorted interpretations, which may inadvertently misrepresent participants’ experiences or fail to capture the nuances of the social world they inhabited (Maxwell, 2013). For example, language, behaviour, or attitudes expressed within a specific socio-political climate may take on different meanings if analysed without adequate contextual reference. Researchers engaging with such data are therefore encouraged to reconstruct, as comprehensively as possible, the conditions under which the data were generated. This includes understanding the original research aims and design, the demographic composition of participants, the temporal and geographical setting, and any prevailing socio-political dynamics that may have shaped participants’ responses (Gronmo, 2024).
In the context of AI-assisted thematic analysis, situatedness becomes an even more pressing concern. Generative AI tools, while efficient at processing and summarising large volumes of text, are typically indifferent to context unless explicitly prompted or trained to account for it. Their reliance on patterns and generalisations may lead to the flattening of complex cultural or historical particularities, especially in datasets where these elements are subtle or implicit. As such, the responsibility falls on the researcher to interrogate AI-generated outputs critically and to supplement them with contextual insight drawn from documentation, metadata, and a thorough review of the data’s provenance. Attending to situatedness does more than mitigate interpretive risk; it enriches the depth and credibility of the analysis by anchoring themes in the specific realities of participants’ lives and social environments. It also reflects a commitment to ethical scholarship, whereby participants’ voices and experiences are treated with respect and care, even when mediated through layers of time, technology, or analytical abstraction. Ultimately, situated interpretation fosters a more nuanced and responsible approach to secondary data, which enhances the integrity and relevance of thematic analysis in both academic and applied settings.
Theoretical Contribution
This paper contributes to the advancement of qualitative research methodology by introducing the RIPES model – an acronym for Reflexivity, Interpretation, Procedural consistency, Evaluation, and Situatedness – as a critical and timely extension to existing approaches to thematic analysis. Although thematic analysis has been widely recognised for its flexibility, accessibility, and applicability across a broad range of qualitative research contexts (Braun & Clarke, 2006), its application to secondary data, particularly in environments mediated by digital technologies and AI, presents distinct methodological and epistemological complexities that have yet to be fully theorised or systematically addressed. The RIPES model advances current understandings of thematic analysis by offering a structured yet adaptable framework that explicitly attends to the challenges and opportunities inherent in engaging with secondary data, including AI-generated or AI-processed textual material (Dwivedi et al., 2023). In doing so, it shifts the analytical lens from a sole focus on primary, researcher-generated data towards a recognition of secondary sources as dynamic, meaning-laden sites that require critical and contextually sensitive engagement. These sources, often shaped by layers of interpretation, historical positioning, and technological mediation, call for an approach that is both theoretically grounded and methodologically rigorous.
Reflexivity, as the foundational pillar of the model, centres the researcher’s interpretative agency and challenges any assumption of neutral or detached analysis. Building on the work of Alvesson and Sköldberg (2000, 2009), and further reinforced by Braun and Clarke (2006) and Ozuem et al. (2022), reflexivity within RIPES acknowledges the co-constructed nature of meaning and urges researchers to continually interrogate their own positionalities, theoretical orientations, and the potential influence of technological tools, such as generative AI, on the interpretive process. Interpretation calls for a deep and critical engagement with the data that moves beyond superficial theme identification. In line with Maxwell (2013), the RIPES model conceptualises interpretation as an ongoing dialogue with both the data and their socio-cultural, institutional, and technological contexts. This is particularly crucial when working with secondary data, where researchers are often detached from the original moments of data production and are required to reconstruct meaning through layers of representation and technological mediation. Procedural consistency highlights the importance of systematicity – not as a mechanical or prescriptive adherence to methodological steps, but as an essential condition for transparency, coherence, and analytic integrity. Within the RIPES framework, consistency is achieved through careful documentation, reflexive justification of coding decisions, and iterative validation of themes; these all contribute to the trustworthiness of the analysis, particularly when working with data that may have already undergone transformation through AI processes. Evaluation reconceptualises the analytic process as one of continual critical appraisal. As argued by Paulus and Marone (2024), knowledge claims derived from secondary data are inherently provisional and situated. Researchers are therefore encouraged to assess not only the credibility and limitations of their datasets but also their own analytic decisions throughout the process, thus acknowledging the layered and contingent nature of thematic insights. Lastly, situatedness urges researchers to locate their interpretations within the broader historical, social, and technological contexts in which the data were originally produced. Drawing on Denzin and Lincoln (2000), Kukla (2000), Maxwell (2013), Ozuem et al. (2022), and Gronmo (2024), this framework challenges universalist or decontextualised readings; instead, it advocates for contextually grounded and ethically attuned interpretations that reflect the specificities of participants’ lived experiences and the environments in which they were documented.
In synthesising these five interrelated components, the RIPES model offers both a conceptual and methodological contribution to qualitative enquiry. It enhances the rigour of thematic analysis by explicitly addressing the unique conditions posed by secondary and AI-mediated data, while also extending its theoretical reach to embrace issues of context, power, and technological influence. As such, the RIPES model provides a timely and flexible framework for researchers seeking to conduct meaningful, reflexive, and methodologically robust thematic analysis in contemporary research environments. This contribution invites further empirical application, cross-disciplinary engagement, and theoretical refinement, thus paving the way for a more context-sensitive and critically informed approach to qualitative research.
Limitations and Further Research Directions
While the RIPES model offers a structured and contextually sensitive framework for conducting thematic analysis on secondary data, certain limitations must be acknowledged. Firstly, the model assumes that sufficient contextual information about the original data collection is available, which is often not the case with archival or AI-generated sources. This limitation can impede the accurate application of situatedness and reflexivity. Secondly, the emphasis on procedural consistency, while enhancing rigour, may inadvertently constrain the flexibility needed when engaging with highly fragmented or incomplete secondary datasets. Furthermore, the model places significant demands on the researcher’s critical and interpretive skills, which may vary considerably across different levels of research expertise. Finally, as with all frameworks, RIPES cannot fully eliminate the inherent challenges of working with secondary data, such as issues of authenticity, ethical uncertainty, and data provenance. Researchers must apply the model with caution and adapt it thoughtfully to specific research contexts.
Conclusion
This paper has proposed the RIPES model – comprising Reflexivity, Interpretation, Procedural consistency, Evaluation, and Situatedness – as a structured and theoretically grounded extension to existing approaches to thematic analysis, particularly in the context of secondary and technology-mediated data. It responds to the growing methodological and epistemological challenges facing qualitative research as researchers increasingly engage with complex, layered datasets shaped by digital technologies and AI. The RIPES model emphasises critical reflexivity, deeper interpretative engagement, systematic transparency, continual evaluation, and sensitivity to context. Together, these elements offer researchers a practical yet conceptually rigorous framework for addressing the complexities inherent in secondary data analysis. The model strengthens methodological rigour and promotes ethical responsibility and contextual awareness to ensure the production of credible and meaningful qualitative insights suited to contemporary research environments. Although no single model can encompass the full diversity of research contexts or challenges, RIPES offers a flexible and adaptable foundation that can be sensitively applied to a wide range of empirical settings. Its value lies in integrating established qualitative principles with a clear recognition of the shifting technological landscape in which data are produced, processed, and interpreted.
An illustrative example involves analysing public discourse on immigration policies via social media. In this context, the researcher’s positionality – shaped by cultural background, political orientation, and linguistic proficiency – inevitably influences the interpretation of AI-generated thematic patterns. The RIPES framework supports reflexive evaluation of how these positional dimensions intersect with the interpretive process, particularly when engaging with politically sensitive or emotionally charged content. It offers a structured means of documenting interpretive choices with transparency, while encouraging critical engagement with both the possibilities and constraints of AI-assisted analysis. Through its emphasis on reflexivity, evaluation, and contextual sensitivity, the framework enables interpretations that are ethically responsible and empirically grounded, thereby enhancing the trustworthiness of the analytic process.
Researchers working across disciplines are provided with a robust guide for engaging critically with secondary sources to help prevent superficial, decontextualised, or ethically insensitive interpretations. Future research is encouraged to apply, assess, and refine the RIPES model across a range of empirical contexts, including studies involving AI-generated materials, historical archives, and complex digital datasets. Through such applications, researchers can contribute to further theoretical development and enhance the model’s practical utility and relevance. In doing so, the field of qualitative enquiry will be better positioned to meet the demands of an evolving research landscape, thus ensuring that secondary data analysis remains rigorous, ethically sound, and attuned to the social and technological conditions from which meaning is derived.
