Abstract
Introduction
Generative Artificial Intelligence (GenAI) tools are proliferating in number, reshaping how learners interact with and process information in various reading and writing tasks (Chen, Wei et al., 2025; Chen, Zhu et al., 2024; Yang et al., 2023). In response to this growing presence, studies on GenAI-assisted reading and writing have explored areas such as learner perception (Kim et al., 2024; Lin et al., 2024), motivation (Wei, 2023), and skill development (Hsiao & Chang, 2023). However, despite the growing body of research, relatively little attention has been given to the impact of these tools on higher-order thinking, such as critical thinking (Darwin et al., 2024). Critical reading and writing are representations of critical thinking, and these two concepts are closely related. Critical reading requires actively analyzing content, questioning arguments, and identifying biases (Teo, 2014; Tsai et al., 2022). Critical writing conveys the outcomes of this complex thought process, including both initial composition and revision, where arguments are refined, ideas clarified, and evidence strengthened (Abdelrahim, 2023; Teo, 2014). Writing and revising critical reading reports of academic papers is a valuable practice for developing critical reading and writing skills (Wallace & Wray, 2021). However, students often encounter challenges, such as synthesizing multiple sources (List & Lin, 2023) and critically evaluating arguments (Cheung et al., 2024). A previous study (Peng et al., 2022) that implemented an AI chatbot to support critical reading demonstrated its usefulness compared to traditional paper-based scenarios, suggesting that GenAI tools might offer new opportunities to enhance student learning. Nevertheless, there remains a lack of research exploring how students revise their critical reading reports using GenAI tools, particularly within a structured framework, as well as the factors influencing students’ application of these tools. 
To address this gap, this study adapted the critical reading and writing framework proposed by Wallace and Wray (2021) to investigate how and why students interact with GenAI tools when revising their critical reading reports across its dimensions.
Literature Review
The Framework of Critical Reading and Writing
Critical thinking is the process of deliberate judgment, closely tied to reading and writing skills, involving interpretation, analysis, and reasoning while taking into account relevant evidence and context (Facione, 2011). Critical reading involves active engagement with a written piece, going beyond literal understanding to question, evaluate, and reflect on the arguments, assumptions, and potential biases within the text (Paul & Elder, 2004; Tsai et al., 2022). Critical writing involves analyzing and synthesizing information from various sources to construct well-reasoned arguments (Abdelrahim, 2023). A critical reading report is a structured document that captures the reader's analysis of a text, often encompassing a summary of key ideas, a critique of the author's assumptions, and an evaluation of the strength and validity of arguments. This report serves as an example of the integration of critical reading and writing (Wallace & Wray, 2021). Writing such a report is vital to maximizing academic performance, as it nurtures the development of critical thinking skills and equips readers to articulate, through a deeper reflective process, what they have read and thought. However, writing an initial report is only the beginning of the learning process. Revising these reports is equally important, as it allows for the refinement of arguments, clarification of ideas, and correction of misunderstandings (Chen et al., 2016; Rijlaarsdam et al., 2004). The revision process, especially when informed by feedback or further reflection, enhances the depth and quality of critical engagement with the text (Cutumisu & Schwartz, 2018). However, relatively few existing studies have adopted a comprehensive framework to explore the critical reading and critical reading report revision process, and this study addresses this gap using the framework proposed by Wallace and Wray (2021).
Wallace and Wray (2021) proposed ten dimensions that individuals should consider when reading texts critically and when writing or revising a critical reading report. These include: (1) research aims and investigation, which involve evaluating both the reader's and the author's purpose, as well as the investigation or research methodology employed; (2) research contributions, identifying what the paper has brought to the field; (3) main claims, understanding the key arguments presented; (4) generalization of the findings, considering how widely they can be applied; (5) the quality of evidence, judging the strength and validity of the supporting data; (6) adaptation of the theoretical frameworks, examining the connection between concepts and the main claims; (7) moral and value preferences, understanding how these influence the arguments; (8) the extent to which the claims are supported or challenged by others’ work, reflecting on whether other research supports or disputes the claims; (9) the consistency of the claims with the reader's own experience; and (10) the summary evaluation.
Related Work on Technology-Assisted Critical Reading and Writing
Various technologies have been developed and integrated into critical reading and writing practices. These technologies include online collaborative tools (Cui & Wang, 2024; Kohnke & Har, 2022; Koşar, 2023), multimodal content (Allagui, 2021; Fazio et al., 2022; Salmerón et al., 2020), and, more recently, GenAI chatbots (Cheung et al., 2024; Nguyen et al., 2024; Peng et al., 2022). In summary, we found that: (1) regardless of the type of educational technology, none has been fully applied to develop all dimensions of students’ critical reading and writing; (2) evaluation of the quality of evidence received the most attention among the various dimensions of critical reading and writing; and (3) students’ selective development in certain aspects of critical reading and writing was influenced by factors including reading or writing purposes, external task requirements from teachers, and the understanding of the content or reading materials. Although existing studies have not found career plans to influence students’ selective attention to particular aspects of critical reading or writing, research by Fonteyne et al. (2018) revealed that students tend to prioritize information that aligns with their career plans.
With reference to online collaborative tools, studies show that platforms like Perusall and Zoom enhance students’ critical reading and writing skills by supporting the evaluation of evidence, text quality, or theoretical concepts, influenced by external demands from teachers and the reading materials. Kohnke and Har (2022) integrated Perusall, a platform with features like social annotation (i.e., collaborative commenting on texts) and summaries of confusing content, into their teaching. Results showed a significant improvement in students’ critical reading, with students reporting more thorough evaluations of the quality of narrative stories from multiple perspectives. This improvement is linked to external demands from teachers in guiding students to focus on these specific dimensions. Similarly, Cui and Wang (2024) found that postgraduate students valued Perusall for scaffolding their critical reading of theoretical concepts in lecture slides, with the content of the reading materials guiding their focus. Koşar (2023) used Zoom to examine its impact on university students’ critical reading skills. The study found significant improvements, particularly in students’ ability to evaluate evidence quality from multiple perspectives. While this selective engagement was influenced by external demands from teachers, Zoom played a key role by enabling real-time discussions, peer feedback, and immediate teacher guidance, which helped students critically assess and refine their interpretations of academic texts.
While platforms like Perusall and Zoom foster active engagement with critical reading through peer interaction, collaborative meaning-making, and structured teacher guidance, multimodal tools extend these benefits by offering alternative ways of processing and analyzing information, such as cross-source comparison and encouraging multimedia literacy. Previous studies show that tools like eye gaze or TED videos, as well as multimodal texts, enhance students’ critical reading skills, especially in evaluating evidence and source credibility, though writing outcomes vary. Factors such as reading purposes, task demands, and reading material focus impacted students’ attention distribution. Salmerón et al. (2020) used eye-gaze replay videos to model undergraduates’ evaluation of evidence quality and source assessment on web pages. While critical reading skills in these areas improved, no significant improvement was seen in argumentation writing quality. The students’ reading and writing purpose, which focused on enhancing their abilities to evaluate evidence quality and assess sources, likely contributed to the prioritization of these dimensions. Allagui (2021) combined TED-watching with comment analysis, leading to marked improvements in critical writing of short argumentation, particularly in identifying main claims, assessing evidence quality, evaluating the extent to which claims are supported or challenged by others’ work, and considering the consistency of claims with the learner's experience. Task demands from teachers likely guided this structured approach to critical analysis. Fazio et al. (2022) found that adolescents improved in evaluating authors’ moral and value preferences when engaging with contrasting multimodal texts on environmental change, driven by the ethical and value-laden aspects of the content.
Studies demonstrated that AI chatbots, like critical reading tools and ChatGPT, enhance students’ critical reading and writing by focusing attention on key dimensions such as research aims and investigation, evidence quality, and contributions, with reading and writing purposes and external demands from teachers further shaping their prioritization. Peng et al. (2022) compared a self-developed critical reading chatbot, which guided users with critical thinking questions, to a traditional paper-based guidebook. They found that university students and researchers selectively responded to the chatbot's prompts, focusing on research aims and contributions, which improved their critical reading skills. This emphasis was shaped by their reading purpose, as they aimed to critically review these aspects through interactions with the chatbot. Similarly, Cheung et al. (2024) used ChatGPT to generate scientific texts on environmental issues, observing significant improvements in students’ ability to evaluate evidence quality across different genres, such as expository and argumentative texts, with external teacher interventions further helping students concentrate on the critical assessment. Nguyen et al. (2024) explored the use of ChatGPT and Google Scholar with ten PhD students writing academic essays and found that the students prioritized evaluating the quality of evidence. This focus was likely influenced by external task demands and the essay-writing purpose, as they were required to provide sufficient evidence to enhance the quality of their essays.
Three research gaps have been detected. First, while research on critical reading and writing has expanded with various technologies, little attention has been given to how students revise critical reading reports in a GenAI-assisted context. Few studies have specifically explored GenAI's role in assisting students’ engagement with academic papers, not only in reading but also in writing and revising. Engagement refers to the behaviors students exhibit or the strategies they use during their reading and revision processes, including attention allocation, time spent on revisions, and the use of various sources to verify and validate information (Ballenghein et al., 2020; Hospel et al., 2016; Zhang & Hyland, 2023). This leaves a gap in understanding how GenAI supports students in engaging with the structures and conventions of academic writing during revision. Second, although previous research has shown that students selectively focus on certain dimensions of critical reading and writing when using technology, comprehensive studies are lacking on whether this focus persists within a broader framework. Addressing this gap is crucial for understanding classroom design and fostering students’ higher-order thinking skills. Third, similarly, there is little research on why students selectively engage with different dimensions of critical reading and writing. Understanding these factors would provide valuable insights into the cognitive decision-making processes students undergo when engaging in critical reading and revising reports.
Research Questions
While research on technology-assisted critical reading and writing is growing, little is known about how students revise critical reading reports with GenAI. Existing studies overlook GenAI's role in engagement during reading, writing, and revision. Additionally, it remains unclear whether students’ selective focus on certain dimensions of critical reading persists in a broader framework and why these patterns occur. To address these gaps, we proposed two research questions: (1) How do students engage with GenAI tools when revising their critical reading reports across the ten dimensions of critical reading and writing? (2) What factors explain students’ selective engagement with particular dimensions during revision?
Methodology
Context and Participants
The experiment was conducted at the Foreign Languages College of a provincial university in Jiangxi, South China, which is known for its focus on teacher training. The participants were 22 postgraduate students (21 females and 1 male) majoring in English teaching and applied linguistics, primarily recruited through a professor and participant referrals. Regarding GenAI chatbot use and academic competency, all participants had prior experience with GenAI chatbots. From their first year in university, students took courses such as English Reading, English Writing, Advanced English, and Academic English, all of which emphasized the cultivation of logical reasoning and critical thinking. Their competency in critical reading and writing was assessed through standardized tests, including the Test for English Majors Band 4, which involves critical thinking in argumentative writing (Liu & Stapleton, 2014). Therefore, they generally received over four years of training in academic skills and critical thinking, with most achieving moderate to high competency in critical reading and writing. Their average age was 23.99 years (ranging from 22 to 28), comprising 14 first-year, 5 second-year, and 3 third-year students, with English proficiency from B2 to C1 on the Common European Framework of Reference for Languages (CEFR) scale. All participants voluntarily joined the study, and informed consent was obtained, with ethics approval granted by the Faculty of Applied Sciences, Macao Polytechnic University.
Materials
Academic Papers
To simulate a real academic environment, each student selected an academic paper as the reading material based on their research interest within applied linguistics or language education. The paper had to be an SSCI-indexed empirical research article to ensure quality and relevance.
Participants’ Self-Written Critical Reading Reports
After reading their selected academic paper, each participant wrote a critical reading report, including the paper's title, author(s), and publication details, followed by a critical review based on the ten dimensions of critical reading and writing from Wallace and Wray (2021). Without word limits, participants were encouraged to thoroughly analyze the paper and provide a comprehensive critical review of the selected papers.
GenAI Chatbot and the GenAI-Generated Critical Reading Reports
Ernie Bot, a ChatGPT-like GenAI chatbot designed by the Baidu company (Holmes & Miao, 2023; Wijaya et al., 2024), was adopted as the tool for students to interact with and generate critical reading reports for students’ selected papers. It was selected because it offers similar functions to ChatGPT-4, including text evaluation, generation, and PDF upload. Moreover, it is widely used and easily accessible in the Chinese mainland. For the output of GenAI-generated critical reading reports for each participant, prompts such as “Make a critical review of the contributions of this study. Please respond in English.” were entered into the GenAI chatbot to generate ten reports, each focused on one of the ten dimensions of critical reading and writing. It should be noted that if the chatbot's response was irrelevant to the paper or the corresponding dimension, the prompt was re-entered until a suitable report was generated (Figure 1).

The Dialogue Box of Ernie Bot and the Critical Reading Report Generation.
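The per-dimension generation protocol described above can be sketched as a simple retry loop. Everything in the sketch below is a hypothetical stand-in: generate() represents the Ernie Bot dialogue box, is_relevant() the human relevance check performed by the first author, and the dimension labels are abbreviated from Wallace and Wray (2021); none of these names come from the study itself.

```python
# Hypothetical sketch of the per-dimension report generation protocol.
# generate() stands in for the Ernie Bot dialogue box; is_relevant()
# models the human judgment of whether a reply fits the paper and
# the target dimension.

DIMENSIONS = [
    "research aims and investigation", "research contributions",
    "main claims", "generalization of the findings",
    "the quality of evidence", "adaptation of the theoretical frameworks",
    "moral and value preferences",
    "support or challenge from others' work",
    "consistency with the reader's experience", "summary evaluation",
]

def build_prompt(dimension):
    # Mirrors the example prompt quoted in the text.
    return (f"Make a critical review of the {dimension} of this study. "
            "Please respond in English.")

def generate_reports(generate, is_relevant, max_retries=3):
    """One report per dimension; re-enter the prompt until the reply
    is judged relevant (or the retry budget runs out)."""
    reports = {}
    for dim in DIMENSIONS:
        prompt = build_prompt(dim)
        for _ in range(max_retries):
            reply = generate(prompt)
            if is_relevant(reply, dim):
                reports[dim] = reply
                break
    return reports
```

Under this sketch, each participant's paper yields ten dimension-specific reports, with irrelevant replies discarded and the prompt re-entered, as described above.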
Procedure
The entire procedure lasted approximately four weeks. At the start, participants took part in a two-hour pre-training session, which included: (1) hands-on experience with the GenAI chatbot, (2) instructions on selecting an academic paper, (3) an introduction to the 10 dimensions of critical reading and writing, along with instructions for writing a critical reading report based on these dimensions, and (4) an overview of the experiment's aims and process. Subsequently, students had three days to select an academic research article, which they sent to the first author, who used Ernie Bot to generate critical reading reports. Participants were then given seven days to read their selected papers and write a critical reading report, and then these reports were submitted to the first author for further experimentation.
The experimental process lasted three weeks, with 22 participants completing it individually. Before the experiment, the first author restated the experimental procedure. Participants then used a computer, preloaded with their digital papers, self-written critical reading reports, GenAI-generated critical reading reports, and the GenAI chatbot interface, to conduct the experiment. Without time constraints or interference, participants could freely read the papers, review GenAI-generated reports, or interact with the chatbot, with the main aim of revising their self-written critical reading reports. All actions were recorded using screen-capture software. When participants were satisfied with their revisions, they notified the first author. Afterward, a reflective interview was conducted, where participants explained their thinking during the process, and all interview data were tape-recorded (Figure 2).

Flowchart of the Research Procedure.
Data Sources
The two primary data sources were: (1) screen-captured videos recording each student's interactions with the GenAI-generated critical reading reports, GenAI chatbot, digital papers, and their revisions of their self-written critical reading reports; and (2) reflective interview data, capturing participants’ reflections on their thought processes, reasons for revisions, and evaluations of the GenAI-generated reports. The interview questions included: (1) When using GenAI-assisted tools for revisions, which dimensions of your critical reading report were most or least impacted by the tool? (2) What were the reasons for revising this specific dimension?
Data Analysis
Data Analysis for RQ1
To answer RQ1, we conducted an analysis of average revision time and lag sequential analysis (LSA). First, two researchers reviewed the screen-captured videos of students to develop a behavior coding schema (see Table 1). Second, the videos were re-watched and each student's behavior was coded based on the established schema, with the duration and frequency recorded. Third, descriptive data on the revision time were calculated to first investigate the engagement level in each dimension. Fourth, to validate the high engagement levels indicated by the analysis of average revision time, all frequency data were entered into the Generalized Sequential Querier 5.1 (GSEQ) to conduct LSA by calculating adjusted residuals (Z-scores) for the behavioral transition sequences.
Coding Schema of Behavior.
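The LSA step can be illustrated as a computation of adjusted residuals over a behavior transition frequency matrix, following Bakeman and Gottman (1997). The sketch below is a minimal NumPy illustration with hypothetical counts for three coded behaviors; the labels and numbers are illustrative, not the study's data.

```python
import numpy as np

def adjusted_residuals(freq):
    """Adjusted residuals (Z-scores) for a behavior transition matrix,
    following the LSA procedure of Bakeman and Gottman (1997).
    freq[i, j] counts how often behavior j immediately follows behavior i;
    a Z-score above 1.96 marks a transition that occurs significantly
    more often than chance (p < .05)."""
    freq = np.asarray(freq, dtype=float)
    n = freq.sum()
    row = freq.sum(axis=1, keepdims=True)   # antecedent totals
    col = freq.sum(axis=0, keepdims=True)   # subsequent totals
    expected = row @ col / n                # chance expectation per cell
    variance = expected * (1 - row / n) * (1 - col / n)
    return (freq - expected) / np.sqrt(variance)

# Hypothetical counts for three coded behaviors (G1, R1, SDP), where
# the G1 -> R1 and R1 -> G1 cells dominate:
freq = np.array([[ 2, 40,  3],   # from G1
                 [35,  4,  6],   # from R1
                 [ 4,  5,  2]])  # from SDP
z = adjusted_residuals(freq)
print(np.round(z, 2))
```

Under this toy matrix, the G1 ↔ R1 cells come out well above the 1.96 threshold while the sparse cells do not, mirroring how significant sequences are flagged in the analysis below.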
Data Analysis for RQ2
To address RQ2, a thematic analysis was conducted on participants’ explanations of their engagement in the revision process. The first author transcribed the interview recordings and repeatedly reviewed the transcripts to identify common themes. Subsequently, the second researcher scrutinized the initial themes and coding data, and a face-to-face meeting was held to discuss revisions. After discussions, the two researchers resolved the discrepancies and made final adjustments to the coding schemes. The data were then recoded according to the refined themes, which included (1) reading and revision purposes, referring to the reasons behind selecting an academic paper and engaging in reading and revision activities (Wallace & Wray, 2021); (2) external demands from supervisors, which here refers to extrinsic requirements imposed on students (Robotham & Julian, 2006); (3) career plans as pre-service teachers, referring to the strong belief in choosing teaching as a future career (Dinçer & Seferoğlu, 2020); and (4) literal misunderstanding of the content, which refers to the misinterpretation of a certain topic at the word-processing level (Basaraba et al., 2013; Verdonik, 2010). The disagreement on the coding was resolved in another meeting, and the interrater agreement ratio reached 92.5%, indicating a high level of agreement.
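An interrater agreement ratio of this kind reduces to matching codes divided by total coded segments. The sketch below is a minimal illustration with purely hypothetical codings of the four themes; only the computation mirrors the procedure described, not the study's actual data or its 92.5% figure.

```python
def percent_agreement(coder_a, coder_b):
    """Interrater agreement ratio: the proportion of coded segments
    assigned the same theme by both coders. The codings below are
    purely hypothetical."""
    assert len(coder_a) == len(coder_b), "coders must rate the same segments"
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

# Hypothetical codings of 8 interview segments into the four themes:
themes_a = ["purpose", "demand", "career", "purpose",
            "misread", "demand", "career", "purpose"]
themes_b = ["purpose", "demand", "career", "demand",
            "misread", "demand", "career", "purpose"]
print(f"{percent_agreement(themes_a, themes_b):.1%}")  # one disagreement out of 8
```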
Findings
Quantitative Findings
To answer RQ1, first, we analyzed the average time students spent on revising their critical reading reports in each dimension. Prior research (Ballenghein et al., 2020) indicated that longer reading times are associated with higher cognitive engagement. Table 2 revealed that students spent the most time selectively revising these reports across five dimensions: research aims and investigation (R1) with an average of 588.46 s, main claims (R3) with 374.73 s, adaptation of the theoretical framework (R6) with 363.70 s, research contributions (R2) with 359.55 s, and the quality of evidence (R5) with 352.64 s. However, students demonstrated lower cognitive engagement in revising five other dimensions: generalization of findings (R4) with an average of 345.92 s, moral and value preferences (R7) with 269.34 s, the extent to which the claims are supported or challenged by others’ work (R8) with 245.15 s, the summary evaluation (R10) with 218.10 s, and the consistency with the reader's experience (R9) with 185.66 s.
Average Time Spent Revising the Self-Written Critical Reading Report for Each Dimension.
Second, an LSA was conducted to further validate the high engagement in these dimensions, with a Z-score exceeding 1.96 indicating a significant sequence (Bakeman & Gottman, 1997). After analyzing the 10 dimensions of critical reading and writing, we found that (1) the revision of the critical reading reports on research aims and investigation (R1), research contributions (R2), the quality of evidence (R5), and adaptation of the theoretical frameworks (R6) typically involved interactions with multiple sources, such as GenAI-generated critical reading reports, the GenAI chatbot, and digital papers, supporting the observation of greater engagement in these dimensions; (2) the revision of main claims (R3) presented a unique case: despite ranking second in revision time, it involved only interactions with the GenAI-generated critical reading report on main claims, according to LSA results. Based on previous research (List & Alexander, 2017), which noted that disengaged students tend to be satisfied with easy-to-obtain materials rather than seeking and evaluating other sources, we deduced that engagement in this dimension was relatively low, as it did not involve any evaluation of other sources. Therefore, this sequence was excluded from further analysis.
For R1, the participants integrated two sources: the digital paper (SDP) and the GenAI-generated critical reading report on research aims and investigation (G1). The interactions between R1 and G1 and between R1 and SDP were significant, with Z-scores of R1 ↔ G1 at 42.63 and 45.3, and R1 ↔ SDP at 2.07 and 2.09. For R6, participants similarly engaged with two primary sources: the digital paper (SDP) and the GenAI-generated critical reading report on the adaptation of the theoretical frameworks (G6). The relationship between R6 and G6, as well as between R6 and SDP, showed significant results, with Z-scores of 33.92 and 39.58 for R6 ↔ G6, and 3.83 and 4.30 for R6 ↔ SDP. This indicated that when revising these two dimensions, participants checked, evaluated, compared, and integrated information from both GenAI-generated reports and the digital paper, synthesizing insights to refine their revisions effectively.
Regarding R2, the participants applied two sources in revising their critical reading reports: the digital paper (SDP) and the GenAI-generated critical reading report on research contributions (G2). However, the behavioral sequences around R2 differed from those in R1 and R6. Specifically, participants transitioned directly from the SDP to revising their self-written critical reading report on research contributions (R2), without returning to the SDP. The transition sequence of SDP → R2 was significant.
As for R5, revising critical reading reports on the quality of evidence involved communication and negotiation between learners and the GenAI chatbot. The transition from IGC to the self-written critical reading report on the quality of evidence (R5) is highlighted, with a Z-score of 2.38, along with significant interactions between the GenAI-generated critical reading report on the quality of evidence (G5) and R5, with Z-scores of 43.49 and 38.13, respectively. This indicated that, although the GenAI-generated report remains one of the primary sources, additional conversations with the GenAI chatbot reflected learners’ need for further assistance in understanding, clarifying, and evaluating the quality of evidence. This process underscores a more exploratory approach, integrating both established and interactive AI tools to improve revisions (Figure 3).

Behavioral Sequences in the Revision of Self-Written Critical Reading Reports with GenAI-Generated Critical Reading Reports, GenAI Chatbot, and Digital Papers.
Qualitative Findings
The qualitative data were analyzed using thematic analysis to confirm and explain the selective high engagement in four dimensions of revising critical reading reports (R1, R6, R2, and R5), while six other dimensions (R3, R4, R7, R8, R9, and R10) showed relatively low engagement, as identified in the revision time data and LSA. Four interrelated factors contributed to the high engagement: (1) learners’ reading and revision purposes, which included (a) gaining knowledge to support their research, writing, or teaching, and (b) exploring the rationale behind selected papers, (2) external demands from supervisors, such as selecting research topics, fulfilling thesis requirements, or solving problems in substitute teaching, (3) career plans as pre-service teachers, where reading pedagogical contributions and revising from different sources were seen as beneficial for future teaching, and (4) literal misunderstandings of the content, such as misinterpreting what a critical reading report on evidence quality should involve.
Specifically, the high engagement in revising research aims and investigation (R1) was driven by the purposes of expanding knowledge for selecting research topics and preparing proposals or theses. External demands, such as supervisors’ approval for projects and funding applications, further reinforced this engagement. Similarly, revising the adaptation of theoretical frameworks (R6) was influenced by students’ purposes of exploring the rationale behind framework choices, enhancing literature reviews, and preparing thesis content. Supervisor demands regarding topic selection and proposal defenses also increased attention to this dimension. Engagement in revising research contributions (R2) was shaped by students’ career plans as pre-service teachers, especially for those pursuing teaching roles. This career focus influenced their purposes to address teaching challenges, meet supervisor demands, and improve understanding of contributions relevant to instructional settings. Differently, engagement in revising the quality of evidence (R5) was affected by literal misunderstandings of the dimension's scope, presenting cognitive challenges as students sought a clearer understanding of evidence quality (Table 3).
Four Factors Explaining the Selective High Engagement With Four Dimensions.
First, regarding the dimension of research aims and investigation (including the research methodology), interviewees valued revising this dimension (R1) for three main reasons: (1) First-year postgraduate students needed to select a research topic and design an experiment, often for innovation fund applications, which required extensive reading and evaluation of research designs (Student J). (2) Second-year students focused on fulfilling thesis proposals, emphasizing a thorough understanding of research aims and methodology to improve their writing (Student C). (3) Third-year students prioritized revisions, as their thesis writing on research aims and investigation would be scrutinized by experts during the defense process.

“I am looking into the research topic of how to integrate reading and writing together, and particularly focusing on how experimental activities are conducted.” (Student J)

“I am about to have my thesis defense soon, and I am still working on the thesis proposal, which I need to send to my supervisor.” (Student C)
Second, high engagement in revising the adaptation of theoretical frameworks (R6) can be attributed to three key reasons. First, some first-year students aimed to understand the rationale behind selecting certain frameworks in their digital papers, influenced by the external demands from supervisors to choose a research topic. Students reported that some scholars often did not explain their framework choices, so they emphasized deeper analysis through integrated reading and writing to justify their own topic selections (Student J). Second, participants sought to improve their literature review and research design, particularly the theoretical framework, due to thesis proposal defense demands, where writing quality in this area is heavily scrutinized. Failure to meet academic standards could delay the defense, prompting them to engage in multifaceted revisions (Student M). Lastly, students nearing graduation focused on improving thesis quality through reading and revision using different sources.

“There was a question I had when planning to research this topic. Looking at the article, I wondered why the author specifically chose these two concepts, but the reasons weren’t clearly explained … The second paragraph of the AI report (on the adaptation of theoretical frameworks) inspired me.” (Student J)

“I have read this article many times, but since I am preparing for my thesis proposal defense, now, my focus is still on the research design, introduction, and literature review.” (Student M)
Third, the importance of revising research contributions (R2) can be explained in three ways. First-year students aimed to gain teaching insights from pedagogical contributions in GenAI reports, which they had previously overlooked, linking this to their career plans. This shift from academic to pedagogical contributions motivated them to integrate both for their future teaching (Student E). Second-year students focused on solving immediate teaching challenges, such as substitute teaching or internships, where analyzing contributions helped them perform well and pass evaluations. Success in substitute teaching could lead to full-time positions, as experienced by Student L, who sought ways to motivate students during her internship. Lastly, senior students used integrated reading and writing on research contributions to improve their thesis revisions, driven by the external demand to meet academic standards.

“I reviewed the AI-provided document, which focused more on the pedagogical aspect … Previously, I focused more on the academic side, but considering my future as a teacher, I’m now more concerned about its pedagogical insights. I believe research should ideally offer practical significance.” (Student E)

“I focus on the pedagogical aspect of the contributions because I recently started substitute teaching, also an internship, at a middle school. It's my first time teaching large classes, and it made me think about how to stimulate students’ interest.” (Student L)
Fourth, students used interactive reading and revision, including communication with the GenAI chatbot, the GenAI-generated critical reading report, and revisions of their self-written critical reading reports, to engage deeply with the dimension of the quality of evidence (R5). This high engagement generally stemmed from literal misunderstandings, particularly misinterpreting what the quality of evidence entails (e.g., Student H understood “the quality of evidence” as referring to the procedural details of a study, such as the number of participants, rather than the more abstract and analytical evaluation this dimension requires). Interviewees reported lacking confidence in analyzing this dimension and, when interacting with materials, noticed gaps between their understanding and the actual requirements (e.g., Student G). They frequently relied on the GenAI chatbot for clarification, indicating that this dimension posed cognitive challenges regardless of year level. Ultimately, students recognized that multiple-source reading and repeated revisions improved their understanding of evidence quality (Student H).

Regarding the analysis of evidence quality, I thought about finding viewpoints from literature reviews. Honestly, I didn’t have confidence in my answers, and later, when I saw AI's answers, I felt I might have been answering in the wrong direction. (Student G)

I misunderstood the topic. It seems I was discussing how the experiment operates, like how many students were invited and how evaluations were conducted. I saw this as evidence. However, after looking at the AI, I realized its evidence was different from mine. So, I made many revisions. (Student H)
Lastly, we analyzed why students showed lower engagement in revising six dimensions of the critical reading report. Specifically, low engagement in revising main claims (R3) and the generalization of findings (R4) was attributed to a similar understanding of the content, reducing the need for further revision. Limited engagement in revising moral and value preferences (R7), the extent to which claims are supported or challenged by others’ work (R8), and the summary evaluation (R10) was linked to a misalignment between reading and revision purposes, compounded by a lack of external demands from supervisors. The low engagement in revising consistency with the reader's experience (R9) was distinct, as the GenAI-generated report's analysis did not align with students’ career plans (Table 4).
Four Factors Explaining the Low Engagement With Six Dimensions.
Compared to the analysis of the quality of evidence, which might lead to misunderstandings, some students noted that their self-written critical reading reports on main claims and the generalization of findings were similar to the GenAI-generated report, reducing the need for revisions.

The limitations of the “Main Claims” section—didn’t I already have them? I remember I have presented them … My critical analysis of the degree of generalization already mentioned these points, so I didn’t refer to this part of the GenAI-generated report. (Student H)
For dimensions such as moral and value preferences, the extent to which claims are supported or challenged by others’ work, and the summary evaluation, most participants disregarded them because supervisors did not require these in proposals. This misalignment led to lower engagement with these dimensions (R7, R8, and R10) in revisions (Student M).

For instance, critically analyzing the extent to which the claims are supported or challenged by others’ work might not be related to my specific reading purpose, so I didn’t examine it in detail. (Student M)
Regarding the dimension of consistency with the reader's experience, Student V served as a representative case, as she saw herself as both a pre-service teacher and a scholar, while the GenAI report focused solely on the student role. Since Student V had already articulated her dual identity in her own writing, she may not have perceived the GenAI-generated feedback as relevant or necessary for revision. This misalignment likely contributed to the low engagement with this dimension, as participants, with a high level of confidence, did not perceive a need to adjust their self-positioning based on AI's narrower perspective.

Because the experience analyzed by AI is different from mine, I think it considers, perhaps, the students’ perspectives and thoughts. However, I see myself as both a researcher and a teacher. (Student V)
Discussion
The study investigated which dimensions of critical reading and writing received greater student engagement in a GenAI-assisted reading and revision context, and explored the reasons behind this increased engagement. Regarding RQ1, students selectively focused on revising their critical reading reports in four key dimensions: research aims and investigation, research contributions, quality of evidence, and adaptation of theoretical frameworks. For RQ2, four factors, including reading and revision purposes, external demands from supervisors, career plans as pre-service teachers, and literal misunderstandings of the content, were associated with prioritizing these dimensions over others.
Firstly, in response to RQ1, it was revealed that GenAI tools have not been comprehensively utilized to assist students in revising their critical reading reports. Prior research on online collaborative tools (Cui & Wang, 2024; Kohnke & Har, 2022; Koşar, 2023) and multimodal content (Allagui, 2021; Fazio et al., 2022; Salmerón et al., 2020) similarly showed that these technologies were not fully applied to support critical reading and writing across all ten dimensions, with emphasis primarily on the quality of evidence and adaptation of theoretical frameworks. Recent studies using AI chatbots (Cheung et al., 2024; Nguyen et al., 2024; Peng et al., 2022) further confirmed selective engagement in specific dimensions, such as the quality of evidence, research aims and investigation, and research contributions. Our study, one of the first to employ a comprehensive framework, indicated that selective engagement was also evident during the revision of academic critical reading reports with GenAI tools. Most importantly, it revealed that educational technologies, including GenAI, have not fundamentally reshaped students’ engagement or decision-making in critical thinking.
Secondly, students showed relatively low engagement in revising their critical reading reports across six dimensions: main claims, generalization of findings, moral and value preferences, the extent to which claims are supported or challenged by others’ work, consistency with the reader's own experiences, and the summary evaluation. This low engagement can be attributed to several factors, including the misalignment between students’ reading and revision purposes, the absence of external demands from supervisors, mismatches with their career plans as pre-service teachers, and the similarity between students’ own understanding and the GenAI-generated reports, all of which likely led students to neglect these dimensions. Previous research on AI-assisted critical reading and writing (Cheung et al., 2024; Nguyen et al., 2024; Peng et al., 2022) similarly suggested that these dimensions did not receive sufficient attention due to participants’ reading and writing purposes. These findings support the view that AI cannot comprehensively address all dimensions of reading (Burriss & Leander, 2024) and writing (Wang, 2022), particularly when dealing with complex moral or societal factors (Böhm et al., 2023), such as inferring the author's moral and value preferences. Moreover, human agency remains central, as students display personal preferences while collaborating with or learning with AI in assisting their higher-order thinking, rather than being replaced by it (Darvishi et al., 2024).
Thirdly, to tackle RQ2, we identified four factors influencing students’ high engagement with GenAI tools in revising certain dimensions of critical reading reports. Three of these factors were also found in previous studies: (1) Reading and writing purposes, which serve as major influences on students’ critical reading (Salmerón et al., 2020), guiding them to engage with dimensions most relevant to their academic goals. For example, Koşar (2023) found that the purpose of assessing a paper's evidence quality significantly enhanced students’ engagement in evaluating the adequacy of supporting details. (2) External demands from teachers, which often direct students’ attention to specific dimensions to meet assessment standards. Cheung et al. (2024) found that teachers explicitly required students to focus on evaluating the quality of evidence in AI-generated texts. (3) Understanding of the content or materials, as students may engage more deeply when they realize misunderstandings and try to correct errors. Additionally, if the material covers unfamiliar content, students may need to invest more effort in learning it. For example, Cui and Wang (2024) found that lecture slides focused on theoretical frameworks and concepts reinforced students’ selective engagement with this dimension.
In addition to these factors, this study identified another factor influencing students’ engagement in critical reading and revision: their career plans as pre-service teachers. Participants consistently evaluated their reading and revision outcomes, emphasizing dimensions that aligned with their strong beliefs about becoming teachers. This finding aligns with earlier views that pre-service teachers’ beliefs significantly shape their learning focus and preferences (Han et al., 2017; Martinez et al., 2024). Participants in this study weighed the four factors holistically, revising their critical reading reports and selectively using GenAI tools to maximize learning outcomes. This demonstrated proficient self-regulation skills, including setting appropriate goals and using self-monitoring and evaluation strategies, which echoes the findings of Tam (2024).
Conclusion
This study contributes to the growing literature on GenAI-assisted academic reading and writing by providing a nuanced understanding of the role of GenAI tools in facilitating L2 learners’ engagement with critical reading and writing dimensions during report revision. It highlights the interplay between human agency and GenAI tools in fostering self-regulation strategies that support critical thinking development, emphasizing students’ active role in navigating and optimizing GenAI tools to enhance their analytical, evaluative, and expressive skills.
This study is significant in demonstrating how GenAI can support but not replace learners’ critical reading and writing processes, reinforcing the importance of student agency in AI-assisted learning. By identifying dimensions with greater engagement and those overlooked, it offers insights for educators, AI developers, researchers, and policymakers to design AI-enhanced learning environments that foster deeper critical engagement while maintaining a balance between AI support and learner autonomy.
Four key implications emerged in this research. First, since students selectively engaged in revising critical reading reports of certain dimensions in the GenAI context, AI developers should design tools with features that prompt users to address neglected areas, such as built-in prompts or scaffolding to encourage broader critical thinking. Second, while AI can assist students in revising critical reading reports, it cannot replace human intervention. Teachers must guide students in effectively using AI to develop critical thinking skills. Teacher involvement, alongside GenAI, can better support students by providing scaffolding, ensuring that AI enhances higher-order thinking rather than acting as a standalone solution. Third, future research should explore self-regulation and metacognitive strategies in GenAI-assisted learning, investigating how these strategies interact with AI tools to empower students to make autonomous, informed decisions during critical reading and revision, leading to more personalized and effective AI-supported educational interventions. Fourth, policymakers should establish guidelines that encourage the use of GenAI tools as an aid in developing critical reading and writing skills without mandating their use or positioning AI as the primary agent. Instead, policies should support a framework that promotes student agency, with teachers providing guidance alongside GenAI assistance.
Two limitations remain in this study. First, the sample size of 22 postgraduate students and the relatively short 4-week duration limit the generalizability of the findings; future studies should include larger samples and longer timeframes. Additionally, this study focused on a specific cohort of students, which may further restrict the applicability of the results, as students with different educational backgrounds may respond differently. Second, the majority of participants were female, which may have influenced the results, though prior research has reported a similarly female-dominated distribution among English majors in Asian countries (Kobayashi, 2002).
