Abstract
Introduction
Generative Artificial Intelligence (GenAI) tools are proliferating in number, reshaping how learners interact with and process information in various reading and writing tasks (Chen, Wei et al., 2025; Chen, Zhu et al., 2024; Yang et al., 2023). In response to this growing presence, studies on GenAI-assisted reading and writing have explored areas such as learner perception (Kim et al., 2024; Lin et al., 2024), motivation (Wei, 2023), and skill development (Hsiao & Chang, 2023). However, despite the growing body of research, relatively little attention has been given to the impact of these tools on higher-order thinking, such as critical thinking (Darwin et al., 2024). Critical reading and writing are representations of critical thinking, and these two concepts are closely related. Critical reading requires actively analyzing content, questioning arguments, and identifying biases (Teo, 2014; Tsai et al., 2022). Critical writing conveys the outcomes of this complex thought process, including both initial composition and revision, where arguments are refined, ideas clarified, and evidence strengthened (Abdelrahim, 2023; Teo, 2014). Writing and revising critical reading reports of academic papers is a valuable practice for developing critical reading and writing skills (Wallace & Wray, 2021). However, students often encounter challenges, such as synthesizing multiple sources (List & Lin, 2023) and critically evaluating arguments (Cheung et al., 2024). A previous study (Peng et al., 2022) that implemented an AI chatbot to support critical reading demonstrated its usefulness compared to traditional paper-based scenarios, suggesting that GenAI tools might offer new opportunities to enhance student learning. Nevertheless, there remains a lack of research exploring how students revise their critical reading reports using GenAI tools, particularly within a structured framework, as well as the factors influencing students’ application of these tools. 
To address this gap, this study adapted the critical reading and writing framework proposed by Wallace and Wray (2021) to investigate how and why students interact with GenAI tools when revising their critical reading reports across its dimensions.
Literature Review
The Framework of Critical Reading and Writing
Critical thinking is the process of deliberate judgment, closely tied to reading and writing skills, involving interpretation, analysis, and reasoning while taking into account relevant evidence and context (Facione, 2011). Critical reading involves active engagement with a written piece, going beyond literal understanding to question, evaluate, and reflect on the arguments, assumptions, and potential biases within the text (Paul & Elder, 2004; Tsai et al., 2022). Critical writing involves analyzing and synthesizing information from various sources to construct well-reasoned arguments (Abdelrahim, 2023). A critical reading report is a structured document that captures the reader's analysis of a text, often encompassing a summary of key ideas, a critique of the author's assumptions, and an evaluation of the strength and validity of arguments. This report serves as an example of the integration of critical reading and writing (Wallace & Wray, 2021). Writing such a report is vital to maximizing academic performance, as it nurtures the development of critical thinking skills and equips readers to articulate, through a deeper reflective process, what they have read and thought. However, writing an initial report is only the beginning of the learning process. Revising these reports is equally important, as it allows for the refinement of arguments, clarification of ideas, and correction of misunderstandings (Chen et al., 2016; Rijlaarsdam et al., 2004). The revision process, especially when informed by feedback or further reflection, enhances the depth and quality of critical engagement with the text (Cutumisu & Schwartz, 2018). However, relatively few existing studies have adopted a comprehensive framework to explore the critical reading and critical reading report revision process, and this study addresses this gap using the framework proposed by Wallace and Wray (2021).
Wallace and Wray (2021) proposed ten dimensions that individuals should consider when reading texts critically and when writing or revising a critical reading report. These include: (1) research aims and investigation, which involve evaluating both the reader's and the author's purpose, as well as the investigation or research methodology employed; (2) research contributions, identifying what the paper has brought to the field; (3) main claims, understanding the key arguments presented; (4) generalization of the findings, considering how widely they can be applied; (5) the quality of evidence, judging the strength and validity of the supporting data; (6) adaptation of the theoretical frameworks, examining the connection between concepts and the main claims; (7) moral and value preferences, understanding how these influence the arguments; (8) the extent to which the claims are supported or challenged by others’ work, reflecting on whether other research supports or disputes the claims; (9) the consistency of the claims with the reader's own experience; and (10) the summary evaluation.
Related Work on Technology-Assisted Critical Reading and Writing
Various technologies have been developed and integrated into critical reading and writing practices. These technologies include online collaborative tools (Cui & Wang, 2024; Kohnke & Har, 2022; Koşar, 2023), multimodal content (Allagui, 2021; Fazio et al., 2022; Salmerón et al., 2020), and, more recently, GenAI chatbots (Cheung et al., 2024; Nguyen et al., 2024; Peng et al., 2022). In summary, we found that: (1) regardless of the type of educational technology, none has been fully applied to develop all dimensions of students’ critical reading and writing; (2) evaluation of the quality of evidence received the most attention among the various dimensions of critical reading and writing; and (3) students’ selective development in certain aspects of critical reading and writing was influenced by factors including reading or writing purposes, external task requirements from teachers, and the understanding of the content or reading materials. Although existing studies have not found career plans to influence students’ selective attention to particular aspects of critical reading or writing, research by Fonteyne et al. (2018) revealed that students tend to prioritize information that aligns with their career plans.
With reference to online collaborative tools, studies show that platforms like Perusall and Zoom enhance students’ critical reading and writing skills by supporting the evaluation of evidence, text quality, or theoretical concepts, influenced by external demands from teachers and the reading materials. Kohnke and Har (2022) integrated Perusall, a platform with features like social annotation (i.e., collaborative commenting on texts) and summaries of confusing content, into their teaching. Results showed a significant improvement in students’ critical reading, with students reporting more thorough evaluations of the quality of narrative stories from multiple perspectives. This improvement is linked to external demands from teachers in guiding students to focus on these specific dimensions. Similarly, Cui and Wang (2024) found that postgraduate students valued Perusall for scaffolding their critical reading of theoretical concepts in lecture slides, with the content of the reading materials guiding their focus. Koşar (2023) used Zoom to examine its impact on university students’ critical reading skills. The study found significant improvements, particularly in students’ ability to evaluate evidence quality from multiple perspectives. While this selective engagement was influenced by external demands from teachers, Zoom played a key role by enabling real-time discussions, peer feedback, and immediate teacher guidance, which helped students critically assess and refine their interpretations of academic texts.
While platforms like Perusall and Zoom foster active engagement with critical reading through peer interaction, collaborative meaning-making, and structured teacher guidance, multimodal tools extend these benefits by offering alternative ways of processing and analyzing information, such as cross-source comparison and encouraging multimedia literacy. Previous studies show that tools like eye gaze or TED videos, as well as multimodal texts, enhance students’ critical reading skills, especially in evaluating evidence and source credibility, though writing outcomes vary. Factors such as reading purposes, task demands, and reading material focus impacted students’ attention distribution. Salmerón et al. (2020) used eye-gaze replay videos to model undergraduates’ evaluation of evidence quality and source assessment on web pages. While critical reading skills in these areas improved, no significant improvement was seen in argumentation writing quality. The students’ reading and writing purpose, which focused on enhancing their abilities to evaluate evidence quality and assess sources, likely contributed to the prioritization of these dimensions. Allagui (2021) combined TED-watching with comment analysis, leading to marked improvements in critical writing of short argumentation, particularly in identifying main claims, assessing evidence quality, evaluating the extent to which claims are supported or challenged by others’ work, and considering the consistency of claims with the learner's experience. Task demands from teachers likely guided this structured approach to critical analysis. Fazio et al. (2022) found that adolescents improved in evaluating authors’ moral and value preferences when engaging with contrasting multimodal texts on environmental change, driven by the ethical and value-laden aspects of the content.
Studies demonstrated that AI chatbots, like critical reading tools and ChatGPT, enhance students’ critical reading and writing by focusing attention on key dimensions such as research aims and investigation, evidence quality, and contributions, with reading and writing purposes and external demands from teachers further shaping their prioritization. Peng et al. (2022) compared a self-developed critical reading chatbot, which guided users with critical thinking questions, to a traditional paper-based guidebook. They found that university students and researchers selectively responded to the chatbot's prompts, focusing on research aims and contributions, which improved their critical reading skills. This emphasis was shaped by their reading purpose, as they aimed to critically review these aspects through interactions with the chatbot. Similarly, Cheung et al. (2024) used ChatGPT to generate scientific texts on environmental issues, observing significant improvements in students’ ability to evaluate evidence quality across different genres, such as expository and argumentative texts, with external teacher interventions further helping students concentrate on the critical assessment. Nguyen et al. (2024) explored the use of ChatGPT and Google Scholar with ten PhD students writing academic essays and found that the students prioritized evaluating the quality of evidence. This focus was likely influenced by external task demands and the essay-writing purpose, as they were required to provide sufficient evidence to enhance the quality of their essays.
Three research gaps have been detected. First, while research on critical reading and writing has expanded with various technologies, little attention has been given to how students revise critical reading reports in a GenAI-assisted context. Few studies have specifically explored GenAI's role in assisting students’ engagement with academic papers, not only in reading but also in writing and revising. Engagement refers to the behaviors students exhibit or the strategies they use during their reading and revision processes, including attention allocation, time spent on revisions, and the use of various sources to verify and validate information (Ballenghein et al., 2020; Hospel et al., 2016; Zhang & Hyland, 2023). This leaves a gap in understanding how GenAI supports students in engaging with the structures and conventions of academic writing during revision. Second, although previous research has shown that students selectively focus on certain dimensions of critical reading and writing when using technology, comprehensive studies are lacking on whether this focus persists within a broader framework. Addressing this gap is crucial for understanding classroom design and fostering students’ higher-order thinking skills. Third, similarly, there is little research on why students selectively engage with different dimensions of critical reading and writing. Understanding these factors would provide valuable insights into the cognitive decision-making processes students undergo when engaging in critical reading and revising reports.
Research Questions
While research on technology-assisted critical reading and writing is growing, little is known about how students revise critical reading reports with GenAI. Existing studies overlook GenAI's role in engagement during reading, writing, and revision. Additionally, it remains unclear whether students’ selective focus on certain dimensions of critical reading persists in a broader framework and why these patterns occur. To address these gaps, we proposed two research questions: (1) How do students engage with GenAI tools when revising their critical reading reports across the ten dimensions of critical reading and writing? (2) What factors explain students’ selective engagement with particular dimensions during revision?
Methodology
Context and Participants
The experiment was conducted at the Foreign Languages College of a provincial university in Jiangxi, South China, which is known for its focus on teacher training. The participants were 22 postgraduate students (21 females and 1 male) majoring in English teaching and applied linguistics, primarily recruited through a professor and participant referrals. Regarding GenAI chatbot use and academic competency, all participants had prior experience with GenAI chatbots. From their first year in university, students took courses such as English Reading, English Writing, Advanced English, and Academic English, all of which emphasized the cultivation of logical reasoning and critical thinking. Their competency in critical reading and writing was assessed through standardized tests, including the Test for English Majors Band 4, which involves critical thinking in argumentative writing (Liu & Stapleton, 2014). Therefore, they generally received over four years of training in academic skills and critical thinking, with most achieving moderate to high competency in critical reading and writing. Their average age was 23.99 years (ranging from 22 to 28), comprising 14 first-year, 5 second-year, and 3 third-year students, with English proficiency from B2 to C1 on the Common European Framework of Reference for Languages (CEFR) scale. All participants voluntarily joined the study, and informed consent was obtained, with ethics approval granted by the Faculty of Applied Sciences, Macao Polytechnic University.
Materials
Academic Papers
To simulate a real academic environment, each student selected an academic paper as the reading material based on their research interest within applied linguistics or language education. The paper had to be an SSCI-indexed empirical research article to ensure quality and relevance.
Participants’ Self-Written Critical Reading Reports
After reading their selected academic paper, each participant wrote a critical reading report, including the paper's title, author(s), and publication details, followed by a critical review based on the ten dimensions of critical reading and writing from Wallace and Wray (2021). Without word limits, participants were encouraged to thoroughly analyze the paper and provide a comprehensive critical review of the selected papers.
GenAI Chatbot and the GenAI-Generated Critical Reading Reports
Ernie Bot, a ChatGPT-like GenAI chatbot designed by the Baidu company (Holmes & Miao, 2023; Wijaya et al., 2024), was adopted as the tool for students to interact with and generate critical reading reports for students’ selected papers. It was selected because it offers similar functions to ChatGPT-4, including text evaluation, generation, and PDF upload. Moreover, it is widely used and easily accessible in the Chinese mainland. For the output of GenAI-generated critical reading reports for each participant, prompts such as “Make a critical review of the contributions of this study. Please respond in English.” were entered into the GenAI chatbot to generate ten reports, each focused on one of the ten dimensions of critical reading and writing. It should be noted that if the chatbot's response was irrelevant to the paper or the corresponding dimension, the prompt was re-entered until a suitable report was generated (Figure 1).

The Dialogue Box of Ernie Bot and the Critical Reading Report Generation.
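The per-dimension generation protocol described above can be sketched as a simple retry loop. Everything in the sketch below is a hypothetical stand-in: generate() represents the Ernie Bot dialogue box, is_relevant() the human relevance check performed by the first author, and the dimension labels are abbreviated from Wallace and Wray (2021); none of these names come from the study itself.

```python
# Hypothetical sketch of the per-dimension report generation protocol.
# generate() stands in for the Ernie Bot dialogue box; is_relevant()
# models the human judgment of whether a reply fits the paper and
# the target dimension.

DIMENSIONS = [
    "research aims and investigation", "research contributions",
    "main claims", "generalization of the findings",
    "the quality of evidence", "adaptation of the theoretical frameworks",
    "moral and value preferences",
    "support or challenge from others' work",
    "consistency with the reader's experience", "summary evaluation",
]

def build_prompt(dimension):
    # Mirrors the example prompt quoted in the text.
    return (f"Make a critical review of the {dimension} of this study. "
            "Please respond in English.")

def generate_reports(generate, is_relevant, max_retries=3):
    """One report per dimension; re-enter the prompt until the reply
    is judged relevant (or the retry budget runs out)."""
    reports = {}
    for dim in DIMENSIONS:
        prompt = build_prompt(dim)
        for _ in range(max_retries):
            reply = generate(prompt)
            if is_relevant(reply, dim):
                reports[dim] = reply
                break
    return reports
```

Under this sketch, each participant's paper yields ten dimension-specific reports, with irrelevant replies discarded and the prompt re-entered, as described above.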
Procedure
The entire procedure lasted approximately four weeks. At the start, participants took part in a two-hour pre-training session, which included: (1) hands-on experience with the GenAI chatbot, (2) instructions on selecting an academic paper, (3) an introduction to the 10 dimensions of critical reading and writing, along with instructions for writing a critical reading report based on these dimensions, and (4) an overview of the experiment's aims and process. Subsequently, students had three days to select an academic research article, which they sent to the first author, who used Ernie Bot to generate critical reading reports. Participants were then given seven days to read their selected papers and write a critical reading report, and then these reports were submitted to the first author for further experimentation.
The experimental process lasted three weeks, with 22 participants completing it individually. Before the experiment, the first author restated the experimental procedure. Participants then used a computer, preloaded with their digital papers, self-written critical reading reports, GenAI-generated critical reading reports, and the GenAI chatbot interface, to conduct the experiment. Without time constraints or interference, participants could freely read the papers, review GenAI-generated reports, or interact with the chatbot, with the main aim of revising their self-written critical reading reports. All actions were recorded using screen-capture software. When participants were satisfied with their revisions, they notified the first author. Afterward, a reflective interview was conducted, where participants explained their thinking during the process, and all interview data were tape-recorded (Figure 2).

Flowchart of the Research Procedure.
Data Sources
The two primary data sources were: (1) screen-captured videos recording each student's interactions with the GenAI-generated critical reading reports, GenAI chatbot, digital papers, and their revisions of their self-written critical reading reports; and (2) reflective interview data, capturing participants’ reflections on their thought processes, reasons for revisions, and evaluations of the GenAI-generated reports. The interview questions included: (1) When using GenAI-assisted tools for revisions, which dimensions of your critical reading report were most or least impacted by the tool? (2) What were the reasons for revising this specific dimension?
Data Analysis
Data Analysis for RQ1
To answer RQ1, we conducted an analysis of average revision time and lag sequential analysis (LSA). First, two researchers reviewed the screen-captured videos of students to develop a behavior coding schema (see Table 1). Second, the videos were re-watched and each student's behavior was coded based on the established schema, with the duration and frequency recorded. Third, descriptive data on the revision time were calculated to first investigate the engagement level in each dimension. Fourth, to validate the high engagement levels indicated by the analysis of average revision time, all frequency data were entered into the Generalized Sequential Querier 5.1 (GSEQ) to conduct LSA by calculating adjusted residuals (Z-scores) for the behavioral transition sequences.
Coding Schema of Behavior.
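The LSA step can be illustrated as a computation of adjusted residuals over a behavior transition frequency matrix, following Bakeman and Gottman (1997). The sketch below is a minimal NumPy illustration with hypothetical counts for three coded behaviors; the labels and numbers are illustrative, not the study's data.

```python
import numpy as np

def adjusted_residuals(freq):
    """Adjusted residuals (Z-scores) for a behavior transition matrix,
    following the LSA procedure of Bakeman and Gottman (1997).
    freq[i, j] counts how often behavior j immediately follows behavior i;
    a Z-score above 1.96 marks a transition that occurs significantly
    more often than chance (p < .05)."""
    freq = np.asarray(freq, dtype=float)
    n = freq.sum()
    row = freq.sum(axis=1, keepdims=True)   # antecedent totals
    col = freq.sum(axis=0, keepdims=True)   # subsequent totals
    expected = row @ col / n                # chance expectation per cell
    variance = expected * (1 - row / n) * (1 - col / n)
    return (freq - expected) / np.sqrt(variance)

# Hypothetical counts for three coded behaviors (G1, R1, SDP), where
# the G1 -> R1 and R1 -> G1 cells dominate:
freq = np.array([[ 2, 40,  3],   # from G1
                 [35,  4,  6],   # from R1
                 [ 4,  5,  2]])  # from SDP
z = adjusted_residuals(freq)
print(np.round(z, 2))
```

Under this toy matrix, the G1 ↔ R1 cells come out well above the 1.96 threshold while the sparse cells do not, mirroring how significant sequences are flagged in the analysis below.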
Data Analysis for RQ2
To address RQ2, a thematic analysis was conducted on participants’ explanations of their engagement in the revision process. The first author transcribed the interview recordings and repeatedly reviewed the transcripts to identify common themes. Subsequently, the second researcher scrutinized the initial themes and coding data, and a face-to-face meeting was held to discuss revisions. After discussions, the two researchers resolved the discrepancies and made final adjustments to the coding schemes. The data were then recoded according to the refined themes, which included (1) reading and revision purposes, referring to the reasons behind selecting an academic paper and engaging in reading and revision activities (Wallace & Wray, 2021); (2) external demands from supervisors, which here refers to extrinsic requirements imposed on students (Robotham & Julian, 2006); (3) career plans as pre-service teachers, referring to the strong belief in choosing teaching as a future career (Dinçer & Seferoğlu, 2020); and (4) literal misunderstanding of the content, which refers to the misinterpretation of a certain topic at the word-processing level (Basaraba et al., 2013; Verdonik, 2010). The disagreement on the coding was resolved in another meeting, and the interrater agreement ratio reached 92.5%, indicating a high level of agreement.
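An interrater agreement ratio of this kind reduces to matching codes divided by total coded segments. The sketch below is a minimal illustration with purely hypothetical codings of the four themes; only the computation mirrors the procedure described, not the study's actual data or its 92.5% figure.

```python
def percent_agreement(coder_a, coder_b):
    """Interrater agreement ratio: the proportion of coded segments
    assigned the same theme by both coders. The codings below are
    purely hypothetical."""
    assert len(coder_a) == len(coder_b), "coders must rate the same segments"
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

# Hypothetical codings of 8 interview segments into the four themes:
themes_a = ["purpose", "demand", "career", "purpose",
            "misread", "demand", "career", "purpose"]
themes_b = ["purpose", "demand", "career", "demand",
            "misread", "demand", "career", "purpose"]
print(f"{percent_agreement(themes_a, themes_b):.1%}")  # one disagreement out of 8
```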
Findings
Quantitative Findings
To answer RQ1, first, we analyzed the average time students spent on revising their critical reading reports in each dimension. Prior research (Ballenghein et al., 2020) indicated that longer reading times are associated with higher cognitive engagement. Table 2 revealed that students spent the most time selectively revising these reports across five dimensions: research aims and investigation (R1) with an average of 588.46 s, main claims (R3) with 374.73 s, adaptation of the theoretical framework (R6) with 363.70 s, research contributions (R2) with 359.55 s, and the quality of evidence (R5) with 352.64 s. However, students demonstrated lower cognitive engagement in revising five other dimensions: generalization of findings (R4) with an average of 345.92 s, moral and value preferences (R7) with 269.34 s, the extent to which the claims are supported or challenged by others’ work (R8) with 245.15 s, the summary evaluation (R10) with 218.10 s, and the consistency with the reader's experience (R9) with 185.66 s.
Average Time Spent Revising the Self-Written Critical Reading Report for Each Dimension.
Second, an LSA was conducted to further validate the high engagement in these dimensions, with a Z-score exceeding 1.96 indicating a significant sequence (Bakeman & Gottman, 1997). After analyzing the 10 dimensions of critical reading and writing, we found that (1) the revision of the critical reading reports on research aims and investigation (R1), research contributions (R2), the quality of evidence (R5), and adaptation of the theoretical frameworks (R6) typically involved interactions with multiple sources, such as GenAI-generated critical reading reports, the GenAI chatbot, and digital papers, supporting the observation of greater engagement in these dimensions; (2) the revision of main claims (R3) presented a unique case: despite ranking second in revision time, it involved only interactions with the GenAI-generated critical reading report on main claims, according to LSA results. Based on previous research (List & Alexander, 2017), which noted that disengaged students tend to be satisfied with easy-to-obtain materials rather than seeking and evaluating other sources, we deduced that engagement in this dimension was relatively low, as it did not involve any evaluation of other sources. Therefore, this sequence was excluded from further analysis.
For R1, the participants integrated two sources: the digital paper (SDP) and the GenAI-generated critical reading report on research aims and investigation (G1). The interactions between R1 and G1 and between R1 and SDP were significant, with Z-scores of R1 ↔ G1 at 42.63 and 45.3, and R1 ↔ SDP at 2.07 and 2.09. For R6, participants similarly engaged with two primary sources: the digital paper (SDP) and the GenAI-generated critical reading report on the adaptation of the theoretical frameworks (G6). The relationship between R6 and G6, as well as between R6 and SDP, showed significant results, with Z-scores of 33.92 and 39.58 for R6 ↔ G6, and 3.83 and 4.30 for R6 ↔ SDP. This indicated that when revising these two dimensions, participants checked, evaluated, compared, and integrated information from both GenAI-generated reports and the digital paper, synthesizing insights to refine their revisions effectively.
Regarding R2, the participants applied two sources in revising their critical reading reports: the digital paper (SDP) and the GenAI-generated critical reading report on research contributions (G2). However, the behavioral sequences around R2 differed from those in R1 and R6. Specifically, participants transitioned directly from the SDP to revising their self-written critical reading report on research contributions (R2), without returning to the SDP. The transition sequence of SDP → R2 was significant.
As for R5, revising critical reading reports on the quality of evidence involved communication and negotiation between learners and the GenAI chatbot. The transition from IGC to the self-written critical reading report on the quality of evidence (R5) is highlighted, with a Z-score of 2.38, along with significant interactions between the GenAI-generated critical reading report on the quality of evidence (G5) and R5, with Z-scores of 43.49 and 38.13, respectively. This indicated that, although the GenAI-generated report remains one of the primary sources, additional conversations with the GenAI chatbot reflected learners’ need for further assistance in understanding, clarifying, and evaluating the quality of evidence. This process underscores a more exploratory approach, integrating both established and interactive AI tools to improve revisions (Figure 3).

Behavioral Sequences in the Revision of Self-Written Critical Reading Reports with GenAI-Generated Critical Reading Reports, GenAI Chatbot, and Digital Papers.
Qualitative Findings
The qualitative data were analyzed using thematic analysis to confirm and explain the selective high engagement in four dimensions of revising critical reading reports (R1, R6, R2, and R5), while six other dimensions (R3, R4, R7, R8, R9, and R10) showed relatively low engagement, as identified in the revision time data and LSA. Four interrelated factors contributed to the high engagement: (1) learners’ reading and revision purposes, which included (a) gaining knowledge to support their research, writing, or teaching, and (b) exploring the rationale behind selected papers, (2) external demands from supervisors, such as selecting research topics, fulfilling thesis requirements, or solving problems in substitute teaching, (3) career plans as pre-service teachers, where reading pedagogical contributions and revising from different sources were seen as beneficial for future teaching, and (4) literal misunderstandings of the content, such as misinterpreting what a critical reading report on evidence quality should involve.
Specifically, the high engagement in revising research aims and investigation (R1) was driven by the purposes of expanding knowledge for selecting research topics and preparing proposals or theses. External demands, such as supervisors’ approval for projects and funding applications, further reinforced this engagement. Similarly, revising the adaptation of theoretical frameworks (R6) was influenced by students’ purposes of exploring the rationale behind framework choices, enhancing literature reviews, and preparing thesis content. Supervisor demands regarding topic selection and proposal defenses also increased attention to this dimension. Engagement in revising research contributions (R2) was shaped by students’ career plans as pre-service teachers, especially for those pursuing teaching roles. This career focus influenced their purposes to address teaching challenges, meet supervisor demands, and improve understanding of contributions relevant to instructional settings. Differently, engagement in revising the quality of evidence (R5) was affected by literal misunderstandings of the dimension's scope, presenting cognitive challenges as students sought a clearer understanding of evidence quality (Table 3).
Four Factors Explaining the Selective High Engagement With Four Dimensions.
First, regarding the dimension of research aims and investigation (including the research methodology), interviewees valued revising this dimension (R1) for three main reasons: (1) First-year postgraduate students needed to select a research topic and design an experiment, often for innovation fund applications, which required extensive reading and evaluation of research designs (Student J). (2) Second-year students focused on fulfilling thesis proposals, emphasizing a thorough understanding of research aims and methodology to improve their writing (Student C). (3) Third-year students prioritized revisions, as their thesis writing on research aims and investigation would be scrutinized by experts during the defense process.

“I am looking into the research topic of how to integrate reading and writing together, and particularly focusing on how experimental activities are conducted.” (Student J)

“I am about to have my thesis defense soon, and I am still working on the thesis proposal, which I need to send to my supervisor.” (Student C)
Second, high engagement in revising the adaptation of theoretical frameworks (R6) can be attributed to three key reasons. First, some first-year students aimed to understand the rationale behind selecting certain frameworks in their digital papers, influenced by the external demands from supervisors to choose a research topic. Students reported that some scholars often did not explain their framework choices, so they emphasized deeper analysis through integrated reading and writing to justify their own topic selections (Student J). Second, participants sought to improve their literature review and research design, particularly the theoretical framework, due to thesis proposal defense demands, where writing quality in this area is heavily scrutinized. Failure to meet academic standards could delay the defense, prompting them to engage in multifaceted revisions (Student M). Lastly, students nearing graduation focused on improving thesis quality through reading and revision using different sources.

“There was a question I had when planning to research this topic. Looking at the article, I wondered why the author specifically chose these two concepts, but the reasons weren’t clearly explained … The second paragraph of the AI report (on the adaptation of theoretical frameworks) inspired me.” (Student J)

“I have read this article many times, but since I am preparing for my thesis proposal defense, now, my focus is still on the research design, introduction, and literature review.” (Student M)
Third, the importance of revising research contributions (R2) can be explained in three ways. First-year students aimed to gain teaching insights from pedagogical contributions in GenAI reports, which they had previously overlooked, linking this to their career plans. This shift from academic to pedagogical contributions motivated them to integrate both for their future teaching (Student E). Second-year students focused on solving immediate teaching challenges, such as substitute teaching or internships, where analyzing contributions helped them perform well and pass evaluations. Success in substitute teaching could lead to full-time positions, as experienced by Student L, who sought ways to motivate students during her internship. Lastly, senior students used integrated reading and writing on research contributions to improve their thesis revisions, driven by the external demand to meet academic standards.

“I reviewed the AI-provided document, which focused more on the pedagogical aspect … Previously, I focused more on the academic side, but considering my future as a teacher, I’m now more concerned about its pedagogical insights. I believe research should ideally offer practical significance.” (Student E)

“I focus on the pedagogical aspect of the contributions because I recently started substitute teaching, also an internship, at a middle school. It's my first time teaching large classes, and it made me think about how to stimulate students’ interest.” (Student L)
Fourth, students used interactive reading and revision, including communication with the GenAI chatbot, the GenAI-generated critical reading report, and revisions of their self-written critical reading reports, to engage deeply with the dimension of the quality of evidence (R5). This high engagement generally stemmed from literal misunderstandings, particularly misinterpreting what the quality of evidence entails (e.g., Student H understood “the quality of evidence” as referring to the procedural details of a study, such as the number of participants, rather than the more abstract and analytical evaluation this dimension requires). Interviewees reported lacking confidence in analyzing this dimension and, when interacting with materials, noticed gaps between their understanding and the actual requirements (e.g., Student G). They frequently relied on the GenAI chatbot for clarification, indicating that this dimension posed cognitive challenges regardless of year level. Ultimately, students recognized that multiple-source reading and repeated revisions improved their understanding of evidence quality (Student H).

Regarding the analysis of evidence quality, I thought about finding viewpoints from literature reviews. Honestly, I didn’t have confidence in my answers, and later, when I saw AI's answers, I felt I might have been answering in the wrong direction. (Student G)

I misunderstood the topic. It seems I was discussing how the experiment operates, like how many students were invited and how evaluations were conducted. I saw this as evidence. However, after looking at the AI, I realized its evidence was different from mine. So, I made many revisions. (Student H)
Lastly, we analyzed why students showed lower engagement in revising six dimensions of the critical reading report. Specifically, low engagement in revising main claims (R3) and the generalization of findings (R4) was attributed to a similar understanding of the content, reducing the need for further revision. Limited engagement in revising moral and value preferences (R7), the extent to which claims are supported or challenged by others’ work (R8), and the summary evaluation (R10) was linked to a misalignment between reading and revision purposes, compounded by a lack of external demands from supervisors. The low engagement in revising consistency with the reader's experience (R9) was distinct, as the GenAI-generated report's analysis did not align with students’ career plans (Table 4).
Four Factors Explaining the Low Engagement With Six Dimensions.
Compared to the analysis of the quality of evidence, which might lead to misunderstandings, some students noted that their self-written critical reading reports on main claims and the generalization of findings were similar to the GenAI-generated report, reducing the need for revisions.

The limitations of the “Main Claims” section—didn’t I already have them? I remember I have presented them … My critical analysis of the degree of generalization already mentioned these points, so I didn’t refer to this part of the GenAI-generated report. (Student H)
For dimensions such as moral and value preferences, the extent to which claims are supported or challenged by others’ work, and the summary evaluation, most participants disregarded them because supervisors did not require these in proposals. This misalignment led to lower engagement with these dimensions (R7, R8, and R10) in revisions (Student M).

For instance, critically analyzing the extent to which the claims are supported or challenged by others’ work might not be related to my specific reading purpose, so I didn’t examine it in detail. (Student M)
Regarding the dimension of consistency with the reader's experience, Student V served as a representative case, as she saw herself as both a pre-service teacher and a scholar, while the GenAI report focused solely on the student role. Since Student V had already articulated her dual identity in her own writing, she may not have perceived the GenAI-generated feedback as relevant or necessary for revision. This misalignment likely contributed to the low engagement with this dimension, as participants, with a high level of confidence, did not perceive a need to adjust their self-positioning based on AI's narrower perspective.

Because the experience analyzed by AI is different from mine, I think it considers, perhaps, the students’ perspectives and thoughts. However, I see myself as both a researcher and a teacher. (Student V)
Discussion
The study investigated which dimensions of critical reading and writing received greater student engagement in a GenAI-assisted reading and revision context, and explored the reasons behind this increased engagement. Regarding RQ1, students selectively focused on revising their critical reading reports in four key dimensions: research aims and investigation, research contributions, quality of evidence, and adaptation of theoretical frameworks. For RQ2, four factors, including reading and revision purposes, external demands from supervisors, career plans as pre-service teachers, and literal misunderstandings of the content, were associated with prioritizing these dimensions over others.
Firstly, in response to RQ1, it was revealed that GenAI tools have not been comprehensively utilized to assist students in revising their critical reading reports. Prior research on online collaborative tools (Cui & Wang, 2024; Kohnke & Har, 2022; Koşar, 2023) and multimodal content (Allagui, 2021; Fazio et al., 2022; Salmerón et al., 2020) similarly showed that these technologies were not fully applied to support critical reading and writing across all ten dimensions, with emphasis primarily on the quality of evidence and adaptation of theoretical frameworks. Recent studies using AI chatbots (Cheung et al., 2024; Nguyen et al., 2024; Peng et al., 2022) further confirmed selective engagement in specific dimensions, such as the quality of evidence, research aims and investigation, and research contributions. Our study, one of the first to employ a comprehensive framework, indicated that selective engagement was also evident during the revision of academic critical reading reports with GenAI tools. Most importantly, it revealed that educational technologies, including GenAI, have not fundamentally reshaped students’ engagement or decision-making in critical thinking.
Secondly, students showed relatively low engagement in revising their critical reading reports across six dimensions: main claims, generalization of findings, moral and value preferences, the extent to which claims are supported or challenged by others’ work, consistency with the reader's own experiences, and the summary evaluation. This low engagement can be attributed to several factors, including the misalignment between students’ reading and revision purposes, the absence of external demands from supervisors, mismatches with their career plans as pre-service teachers, and the similarity between students’ own understanding and the GenAI-generated reports, all of which likely led students to neglect these dimensions. Previous research on AI-assisted critical reading and writing (Cheung et al., 2024; Nguyen et al., 2024; Peng et al., 2022) similarly suggested that these dimensions did not receive sufficient attention due to participants’ reading and writing purposes. These findings support the view that AI cannot comprehensively address all dimensions of reading (Burriss & Leander, 2024) and writing (Wang, 2022), particularly when dealing with complex moral or societal factors (Böhm et al., 2023), such as inferring the author's moral and value preferences. Moreover, human agency remains central, as students display personal preferences while collaborating with or learning with AI in assisting their higher-order thinking, rather than being replaced by it (Darvishi et al., 2024).
Thirdly, to tackle RQ2, we identified four factors influencing students’ high engagement with GenAI tools in revising certain dimensions of critical reading reports. Three of these factors were also found in previous studies: (1) Reading and writing purposes, which serve as major influences on students’ critical reading (Salmerón et al., 2020), guiding them to engage with dimensions most relevant to their academic goals. For example, Koşar (2023) found that the purpose of assessing a paper's evidence quality significantly enhanced students’ engagement in evaluating the adequacy of supporting details. (2) External demands from teachers, which often direct students’ attention to specific dimensions to meet assessment standards. Cheung et al. (2024) found that teachers explicitly required students to focus on evaluating the quality of evidence in AI-generated texts. (3) Understanding of the content or materials, as students may engage more deeply when they realize misunderstandings and try to correct errors. Additionally, if the material covers unfamiliar content, students may need to invest more effort in learning it. For example, Cui and Wang (2024) found that lecture slides focused on theoretical frameworks and concepts reinforced students’ selective engagement with this dimension.
In addition to these factors, this study identified another factor influencing students’ engagement in critical reading and revision: their career plans as pre-service teachers. Participants consistently evaluated their reading and revision outcomes, emphasizing dimensions that aligned with their strong beliefs about becoming teachers. This finding aligns with earlier views that pre-service teachers’ beliefs significantly shape their learning focus and preferences (Han et al., 2017; Martinez et al., 2024). Participants in this study weighed the four factors holistically, revising their critical reading reports and selectively using GenAI tools to maximize learning outcomes. This demonstrated proficient self-regulation skills, including setting appropriate goals and using self-monitoring and evaluation strategies, which echoes the findings of Tam (2024).
Conclusion
This study contributes to the growing literature on GenAI-assisted academic reading and writing by providing a nuanced understanding of the role of GenAI tools in facilitating L2 learners’ engagement with critical reading and writing dimensions during report revision. It highlights the interplay between human agency and GenAI tools in fostering self-regulation strategies that support critical thinking development, emphasizing students’ active role in navigating and optimizing GenAI tools to enhance their analytical, evaluative, and expressive skills.
This study is significant in demonstrating how GenAI can support but not replace learners’ critical reading and writing processes, reinforcing the importance of student agency in AI-assisted learning. By identifying dimensions with greater engagement and those overlooked, it offers insights for educators, AI developers, researchers, and policymakers to design AI-enhanced learning environments that foster deeper critical engagement while maintaining a balance between AI support and learner autonomy.
Four key implications emerged in this research. First, since students selectively engaged in revising critical reading reports of certain dimensions in the GenAI context, AI developers should design tools with features that prompt users to address neglected areas, such as built-in prompts or scaffolding to encourage broader critical thinking. Second, while AI can assist students in revising critical reading reports, it cannot replace human intervention. Teachers must guide students in effectively using AI to develop critical thinking skills. Teacher involvement, alongside GenAI, can better support students by providing scaffolding, ensuring that AI enhances higher-order thinking rather than acting as a standalone solution. Third, future research should explore self-regulation and metacognitive strategies in GenAI-assisted learning, investigating how these strategies interact with AI tools to empower students to make autonomous, informed decisions during critical reading and revision, leading to more personalized and effective AI-supported educational interventions. Fourth, policymakers should establish guidelines that encourage the use of GenAI tools as an aid in developing critical reading and writing skills without mandating their use or positioning AI as the primary agent. Instead, policies should support a framework that promotes student agency, with teachers providing guidance alongside GenAI assistance.
Two limitations remain in this study. First, the sample size of 22 postgraduate students and the relatively short 4-week duration limit the generalizability of the findings; future studies should include larger samples and longer timeframes. Additionally, this study focused on a specific cohort of students, which may further restrict the applicability of the results, as students with different educational backgrounds may respond differently. Second, the majority of participants were female, which may have influenced the results, though prior research has reported a similarly female-dominated distribution among English majors in Asian countries (Kobayashi, 2002).
