I Introduction
Written corrective feedback (WCF), the provision of negative evidence on overt errors (Hanaoka & Izumi, 2012) in second language (L2) writing, is ‘a ubiquitous pedagogical practice’ (Ferris, 2010, p. 198) considered essential and even ‘nonnegotiable’ (Ferris, 2010) by both students (Leki, 1991) and teachers (Evans, Hartshorn & Tuioti, 2010) alike. WCF is used in L2 writing instruction, or Learning-to-Write, contexts to help learners become more effective writers. It is also used in Instructed Second Language Acquisition (ISLA), or Writing-to-Learn, contexts (Manchón, 2011; Williams, 2012), in which the reactive, incidental, and individualized Focus on Form within a communicative framework is thought to promote long-term L2 development. The use of WCF is supported theoretically by both socio-cognitive and socio-cultural traditions of SLA. Noticing (Schmidt, 1994, 2001) of overt gaps between drafts and target-level writing may trigger interlanguage restructuring via reflection on language use, form and meaning mapping, and hypothesis confirmation or rejection (Swain, 1985, 1995). These processes may be enhanced by the extra attentional resources afforded by the written mode, the permanence and visible salience of written text, and the recursive nature of writing (Adams, 2003; Gilabert, Manchón & Vasylets, 2016; Manchón, 2011; Sachs & Polio, 2007; Williams, 2012), which may facilitate Monitoring (Krashen, 1981) or provide opportunity to convert declarative knowledge into procedural knowledge (DeKeyser, 2003; Polio, 2012).
Despite claims that WCF provision is not worthwhile because of its explicit nature (e.g. Truscott, 1996), its incompatibility with developmental readiness (e.g. Pienemann, 1989), and its discouragement of risk-taking (e.g. Truscott, 1996, 1999a, 2007), there is growing empirical evidence that it can have a positive impact on accuracy (for overviews, see Bitchener & Storch, 2016; Kang & Han, 2015). WCF has been found to lead to more accurate revisions of the same text (e.g. Ashwell, 2000; Biber, Nekrasova & Horn, 2011; Fathman & Whalley, 1990; Ferris & Roberts, 2001; Kang & Han, 2015). There is also evidence that it can improve accuracy in new pieces of writing (Ellis, 2016; Kang & Han, 2015) and that it has a long-term learning effect on accuracy development (Bitchener & Storch, 2016). WCF appears to be effective whether it is direct, indirect, and/or metalinguistic (e.g. Ellis et al., 2008).
Nevertheless, WCF has important limitations. First, it is usually focused on linguistic accuracy, and may not aid the development of linguistic complexity, or ‘the extent to which language produced in performing a task is elaborate and varied’ (Ellis, 2003, p. 340). Indeed, it has been suggested that WCF may actually result in simplified writing (Truscott, 1996, 1999b, 2007). Learners may fear making mistakes, or lack sufficient cognitive capacity to attend to interlanguage extension and restructuring simultaneously (Skehan & Foster, 2001, p. 191). Hanaoka & Izumi (2012, p. 332) describe this risk avoidance, when learners ‘stop short of articulating their meaning or form,’ as ‘covert’ errors. A second limitation of WCF is that it is often labour-intensive and time-consuming. This can be particularly so with large cohorts and in online contexts. It can also be challenging for providers when task difficulty exceeds learners’ ability, or when learners have taken insufficient time or care in their writing. A third drawback is that learners without access to an expert provider may not receive WCF at all.
One pedagogical intervention to help mitigate these shortcomings may be the use of comparator (Lynch, 2009), or model, texts written by expert users. These are provided at the same stage of the writing process as WCF, namely following a draft and before a revision, when learners’ interlanguage may be more malleable (Hanaoka & Izumi, 2012, p. 345). Like WCF, comparators provide timely and relevant Comprehensible Input (Krashen, 1981), facilitate cognitive comparison (Ellis, 1995), or gap noticing (Schmidt, 1994), aid interlanguage restructuring, and act as a textually mediated scaffold (Bitchener & Storch, 2016; Storch, 2018). However, unlike WCF, comparators also provide large amounts of target-like positive evidence (Qi & Lapkin, 2001), which may provide the opportunity for hole noticing (Muranoi, 2007, p. 57) of ‘covert’ issues (Hanaoka and Izumi, 2012), when learners have either avoided taking risks, or have written accurately but in perhaps a less sophisticated way. This positive evidence may provide an opportunity for complexity development, or interlanguage extension. Further, this development may also coincide with the learner’s own internal syllabus, foster greater autonomy and encourage deeper cognitive effort (Qi & Lapkin, 2001; Sachs & Polio, 2007). Comparators may also allow the possibility of implicit detection (Tomlin & Villa, 1994) providing opportunity for more implicit learning processes to occur.
From a practical perspective, comparators can provide opportunity for a timely focus on form within a communicative language programme according to the individual learner’s priorities and developmental readiness rather than external intervention (Long & Crookes, 1992). Although they may not always result in the cognitive comparisons required to trigger development, or respond to every individual learner’s feedback needs, comparator exposure may influence both accuracy and complexity development (Long, 2015, p. 28). Comparator use may also help reduce the amount of WCF required saving tutor time and encouraging autonomy. For learners without access to a WCF provider, comparators also offer a partial substitute. Traditionally, model texts tended to be presented at the outset of the pedagogic sequence for imitation purposes. This template approach was rightly criticized for placing teachers’ ideas and ‘teachers’ words into students’ mouths’ (Ferris, 2010, p. 190). However, as was pointed out long ago (Eschholz, 1980; Watson, 1982), the texts can be used for the purposes of comparison rather than imitation when their introduction is delayed until after the production of a draft. Thus, the learner has already invested time and effort into the draft, and established ownership of the writing.
II The present study
Despite the evidence on WCF, much remains unknown. First, most studies have used focused WCF, targeting one or several error categories (e.g. Bitchener, 2008; Bitchener & Knoch, 2008; Bitchener, Young & Cameron, 2005; Ellis et al., 2008; Sheen, 2007). However, fewer have investigated comprehensive WCF, which is common in many teaching contexts (Liu & Brown, 2015), and reflects pedagogical practice in which teachers provide, and learners expect, WCF on all errors in the text (Bruton, 2009; Ferris, 2010; Truscott & Hsu, 2008; Van Beuningen, 2010; Xu, 2009). Notable examples are those by Van Beuningen, De Jong & Kuiken (2008, 2012), Bonilla López, Van Steendam & Buyse (2017), and Karim & Nassaji (2020), which used delayed post-tests and new pieces of writing to establish that comprehensive WCF can be beneficial for long-term accuracy development.
A further issue is that few studies have examined the impact of WCF on linguistic complexity. One that did so is Van Beuningen et al.’s (2008, 2012) study. The authors found that WCF did not detract from writing complexity either lexically or syntactically, findings which refuted Truscott’s (1996) contention that WCF may lead to simplified writing. However, the authors did not report any increase in complexity. A final point is that new pieces of writing used in many post-tests have tended to be of the same genre. This means that although there was opportunity to demonstrate development of genre-specific features, such as past simple forms or phrases, such as
Comparator use has attracted surprisingly little research attention. Those studies that have been published have been of a different design to those examining the long-term effects of WCF. Typically, they have employed identification of learners’ metalinguistic language related episodes (LREs) via note-taking (e.g. Hanaoka & Izumi, 2012; García Mayo & Labandibar, 2017) or collaborative dialogue (Coyle et al., 2018; Yang & Zhang, 2010) to demonstrate noticing of overt and covert problems. Hanaoka’s (2006, 2007) Japanese first language (L1) learners, particularly the more proficient ones, reported noticing, and were able to self-correct most of their overt language problems, mainly lexical, with reference to the comparators. Learners used the comparators to address covert issues that they had avoided in their first draft, a finding which suggests the utility of comparators in addressing Truscott’s (1996) concerns about simplified writing. Similar findings were reported in a Spanish secondary school context by Martinez Esteban & Roca de Larios (2010), and in a Chinese university context by Yang & Zhang (2010).
Subsequent studies in primary and secondary school settings have also found that comparator use promoted gap noticing of covert issues, particularly of lexis and subsequent uptake in revisions (Cánovas Guirao, Roca de Larios & Coyle, 2015; Coyle, Cánovas Guirao, & Roca de Larios, 2018; Coyle & Roca de Larios, 2014; García Mayo & Labandibar, 2017). Comparator use has been found to be more effective than reformulation in helping learners overcome covert problems (Hanaoka & Izumi, 2012), and more effective than direct WCF in promoting lexical noticing (Coyle & Roca de Larios, 2014). Interestingly, Coyle et al.’s (2018) study found that learners improved their revisions without having provided metalinguistic evidence of noticing in either their notes or in collaborative dialogue, which suggests that developmental processes may have occurred at different points along the implicit/explicit spectrum.
While these studies clearly illuminate the processes involved when learners use comparators, their design means that they cannot be compared with the WCF studies cited above. They did not attempt to investigate long-term development by including a delayed post-test, or ask their learners to produce a new piece of writing. Moreover, they did not employ the constructs of linguistic accuracy, used in the WCF literature, or linguistic complexity widely used for measuring L2 output in SLA (e.g. Norris & Ortega, 2009). It could further be added that while analyses of metalinguistic LREs provide valuable insights into noticing, they overlook processes, either explicit or implicit, which are not externally evidenced in note form or collaborative dialogue.
III Method
To address ‘practice-based problems’ (McKinley, 2019) around WCF, namely the relative scarcity of research on the effects of comprehensive WCF on L2 development and the absence of research on the effects of comparators on long-term L2 development, this study set out to investigate the effects of both interventions on revisions and new pieces of writing. Although indirect WCF may be more effective for acquisition because it requires greater cognitive effort (Adams, 2003; Bitchener & Knoch, 2008; Ferris, 1999, 2003; Lalande, 1982; Li, 2010), it was nevertheless decided to use direct WCF. This was to ensure that both WCF and comparator treatment groups had access to target-like language for the required cognitive comparisons (Ellis, 2009; Ferris, 2002). In addition, as little is known about the effects of comparators on: (a) immediate and delayed revisions of the same text; (b) immediate and delayed production of new texts of the same genre; and (c) immediate and delayed production of new texts involving the same topic but of a different genre, this study addressed the following research questions.
1 Research questions
Research question 1: What is the effect of comprehensive direct WCF on the complexity and accuracy of same text revisions (Same Text Revision), new texts of the same genre (New Text Same Genre), and new texts involving the same topic but of a different genre (New Text Same Topic)?
Research question 2: What is the effect of comparators on the complexity and accuracy of same text revisions (Same Text Revision), new texts of the same genre (New Text Same Genre), and new texts involving the same topic but of a different genre (New Text Same Topic)?
2 Participants and setting
The data was obtained from 42 adult participants, enrolled on a full-time pre-sessional English course at a Scottish university. Their course, which included General and Academic English strands, consisted of 20 contact hours a week and was primarily intended to prepare students to enter taught post-graduate programmes. Although, at the outset, 54 participants provided consent to take part in the study, 12 were lost to absence on one of the three data collection days. The participants were in seven existing classes, into which they had been placed by the University’s placement test at the outset of their course, and ranged narrowly in level from approximately low-intermediate CEFR B1 to upper-intermediate B2. They came from Japan (
3 Research design
The study had a quasi-experimental design with two treatment groups: Treatment 1 WCF (

Data collection procedure.
4 Materials
The participants were asked to write two genres: a picture narrative and a picture description (see Appendix 1 in supplemental material). These were chosen for their short length and their assumed unfamiliarity. The picture narrative, widely used in previous WCF studies, was based on a sequence of six pictures and there were two versions. Version A told the story of a family unable to pay for groceries in a supermarket because the father had left his wallet at an ice cream parlour. Version B was about a young woman who did odd jobs to earn money to buy a mobile phone. The participants were asked to write between 130 and 170 words. The second genre was a picture description. In order to provide opportunity to demonstrate topic-related lexical development, the description was of one picture selected from each of the narrative sequences. Again, there were two versions. Version A was a picture of a family eating ice creams outside an ice cream parlour. Version B was a picture of a young woman cutting the grass in a garden. The participants were asked to write between 60 and 80 words. All prompts and pictures were printed in paper booklets and the participants wrote their texts with a pen in the spaces provided.
5 Treatment
Pre-, post-, and delayed post-test designs, which include opportunity for revision, can blend writing instruction and SLA research agendas (Ferris, 2010), thereby informing both. Therefore, data collection included both text revision and the writing of new texts. In order to reduce the potential influence of any differences in difficulty between the two versions, the tasks were counterbalanced. In Week 1, all three groups wrote Narrative 1 (Version A or B) and the Description (Version A or B). Participants who received Version A of the narrative also received Version A of the description (ice cream picture). The same applied with Version B (garden picture). The participants were allowed 30 minutes to complete both tasks and were not permitted to talk to each other or use any aids, such as dictionaries. When the time was up, all materials and student writing were collected by the teacher, who then proceeded with the regular class as normal.
In week 2, the WCF group received their first draft of Narrative 1, but not the Description, complete with comprehensive direct WCF provided by the researcher. The WCF consisted of a line drawn through the error with the target form written above. The participants were instructed to study the WCF for 10 minutes and invited to make notes on a sheet provided (see Appendix 2 in supplemental material). After 10 minutes, the first draft was collected by the teacher. The WCF group, with access to their notes only, then revised Narrative 1 (Same Text Revision), wrote Narrative 2 (New Text Same Genre), and revised the Description (New Text Same Topic). The comparator group received two comparators of their respective narrative written by native-speaker colleagues of the researcher (see Appendix 3 in supplemental material). Two comparators provided a higher volume of relevant target-like positive evidence (Hanaoka & Izumi, 2012) and greater exemplification of the amount of variability permissible in the genre. The participants were instructed to study the comparators for 10 minutes and invited to make notes on a sheet provided (see Appendix 4 in supplemental material). After 10 minutes, the comparators were collected by the teacher. Like the WCF group, the comparator group, with access to their notes only, then revised Narrative 1 (Same Text Revision), wrote Narrative 2 (New Text Same Genre), and revised the Description (New Text Same Topic). The third group received neither WCF nor comparators. This group only revised Narrative 1 (Same Text Revision), wrote Narrative 2 (New Text Same Genre), and revised the Description (New Text Same Topic). All three groups were allowed 50 minutes to complete the three tasks, and again were not permitted to talk to each other or use any aids. The WCF and comparator groups were not permitted access to their feedback or comparators respectively. 
This was to avoid the possibility of simple copying either of the corrected draft or one of the comparators. Again, when the time was up, all materials, including the notes, and student writing were collected by the teacher, who then proceeded with the regular class as normal. In the case of the no treatment group, as there was no intervention, the regular class was 10 minutes longer. In week 6, all three groups again rewrote the same Narrative 1 (Same Text Revision), Narrative 2 (New Text Same Genre), and the Description (New Text Same Topic) under the same conditions.
6 Analysis
All texts were typed by the researcher and saved as text files. They were analysed for linguistic complexity and accuracy. Following Norris & Ortega’s (2009) recommendation for researchers to take into consideration the multi-dimensionality of complexity, four measures were chosen. Lexical complexity was measured using a square root type token ratio (RTTR) calculated using Kyle & Crossley’s (2015) Tool for the Automatic Analysis of Lexical Sophistication (TAALES, 2015). A square root was used to allow for the short nature of the texts. Three syntactic complexity measures were also used: a coordination measure, a ratio of the number of coordinate phrases per T-unit (CP/T); a subordination measure, a ratio of the number of complex T-units per T-unit (CT/T); and a phrasal measure, a ratio of the number of complex nominals per T-unit (CN/T). The syntactic complexity measures were calculated using Kyle’s (2016) Tool for the Automatic Analysis of Syntactic Sophistication and Complexity (TAASSC, 2016), based on the L2 Syntactic Complexity Analyser (Lu, 2010). Like TAALES, TAASSC is both free to download and widely used for its reliability (Lu & Ai, 2015, p. 22) and its strong correlation with human rater scores using performance descriptors (e.g. Yang, Lu & Weigle, 2015). While four measures were chosen to reflect the multidimensionality of complexity, for accuracy only one measure was used. Following previous studies (Van Beuningen et al., 2012; Karim & Nassaji, 2020), this was an error ratio, again chosen to reflect the short nature of the texts. The number of errors was counted, divided by the total number of words, and multiplied by one hundred. The errors were counted manually by the researcher using a protocol based on that used by Van Beuningen et al. (2012) (see Appendix 5 in supplemental material). In order to minimize subjectivity, a policy of only counting ‘absolute’ errors (e.g.
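The two text-level measures described above are simple enough to sketch directly. The following is a minimal illustration only, not the study's pipeline: RTTR was in fact computed by TAALES, and the whitespace tokenization and lower-casing here are assumptions.

```python
import math

def rttr(text: str) -> float:
    """Root type-token ratio (Guiraud's index): types / sqrt(tokens).
    Whitespace tokenization and lower-casing are simplifying assumptions;
    TAALES applies its own tokenization."""
    tokens = text.lower().split()
    return len(set(tokens)) / math.sqrt(len(tokens))

def error_ratio(num_errors: int, num_words: int) -> float:
    """Errors per 100 words, following Van Beuningen et al. (2012)."""
    return num_errors / num_words * 100

# Example: 6 tokens, 5 types -> 5 / sqrt(6)
print(rttr("the cat sat on the mat"))
# Example: 7 errors in a 140-word text -> 5.0 errors per 100 words
print(error_ratio(7, 140))
```

The square root in the denominator dampens the sensitivity of a plain type-token ratio to text length, which matters here because the elicited texts were short (60–170 words).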
7 Statistical analysis
Given the small sample sizes, the assumption of normality (Shapiro–Wilk test) was frequently violated for the complexity and accuracy measures at the three data collection points (see Appendix 6 in supplemental material). Therefore, non-parametric Friedman tests were used to look for any significant changes from Week 1 to Week 6. If significance was indicated, post-hoc Wilcoxon Signed Rank tests with Bonferroni adjustments (
For the four complexity measures and the accuracy measure, Kruskal–Wallis tests performed in Week 1 showed that, with one exception, the complexity measure for coordination in the Description, there were no statistically significant differences between the three groups at the outset of the study (see Appendix 7 in supplemental material).
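The testing sequence described above (omnibus Friedman test, then pairwise Wilcoxon signed-rank tests with a Bonferroni-adjusted alpha) can be sketched with SciPy. The scores below are invented for illustration and are not the study's data.

```python
import numpy as np
from scipy.stats import friedmanchisquare, wilcoxon

# Hypothetical within-group scores at the three data collection points
week1 = np.array([3.1, 2.8, 3.5, 2.9, 3.2, 3.0, 2.7, 3.4])
week2 = np.array([3.6, 3.0, 3.9, 3.3, 3.8, 3.4, 3.1, 3.7])
week6 = np.array([3.4, 2.9, 3.7, 3.1, 3.5, 3.2, 3.0, 3.6])

# Omnibus non-parametric test across the three repeated measures
stat, p = friedmanchisquare(week1, week2, week6)

if p < .05:
    # Post-hoc pairwise Wilcoxon signed-rank tests; the Bonferroni
    # adjustment divides alpha by the number of comparisons (3 pairs)
    alpha_adj = .05 / 3
    for a, b, label in [(week1, week2, "W1 vs W2"),
                        (week1, week6, "W1 vs W6"),
                        (week2, week6, "W2 vs W6")]:
        w, p_pair = wilcoxon(a, b)
        print(label, "significant" if p_pair < alpha_adj else "ns")
```

The Friedman test is the non-parametric analogue of a one-way repeated measures ANOVA, ranking each participant's three scores internally, so it tolerates the non-normal distributions reported for these small samples.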
IV Results
1 WCF
Research question 1 concerned the effects of comprehensive direct WCF on the complexity and accuracy of same text revisions (Same Text Revision), new texts of the same genre (New Text Same Genre), and new texts involving the same topic but of a different genre (New Text Same Topic). Descriptive statistics relating to these can be seen in Table 1.
Effects of Written Corrective Feedback (WCF) on complexity and accuracy of Same Text Revision, New Text Same Genre, and New Text Same Topic (
For Same Text Revision, there were decreases in lexical complexity and coordination. Friedman tests showed that neither of these changes was significant. In contrast, there were increases in subordination and phrasal complexity. For subordination, Friedman tests revealed a significant change from Week 1 to Week 6 (χ2 (2, 12) = 6.727,
For New Text Same Genre, there was a marked decline in the lexical complexity in Week 2 followed by a slight recovery in Week 6. For coordination, the writing became less complex in Week 2 before rising in Week 6. For subordination, complexity rose in Week 2 before falling in Week 6. For phrasal complexity, the writing became more complex in Week 2 and again in Week 6. Friedman tests revealed that none of these changes was significant. As for accuracy, there was improvement from Week 1 to Week 2, and from Week 2 to Week 6. Again, Friedman tests revealed that this was not significant.
For New Text Same Topic, lexical complexity rose, particularly from Week 2 to Week 6. For coordination, the writing was less complex from Week 1 to Week 2 before rising to Week 6. For subordination, complexity rose from Week 1 to Week 2 before falling in Week 6. Phrasal complexity increased from Week 1 to Week 2, and increased further to Week 6. Accuracy improved from Week 1 to Week 2 before falling in Week 6. Friedman tests showed that none of these changes was significant.
2 Comparators
Research question 2 concerned the effects of comparators on the complexity and accuracy of same text revisions (Same Text Revision), new texts of the same genre (New Text Same Genre), and new texts involving the same topic but of a different genre (New Text Same Topic). Descriptive statistics relating to these can be seen in Table 2.
Effects of comparators on complexity and accuracy of Same Text Revision, New Text Same Genre, and New Text Same Topic (
For Same Text Revision, there was a clear increase in the comparator group’s lexical complexity from Week 1 to Week 2, falling back in Week 6. Friedman tests conducted for each condition revealed a significant change from Week 1 to Week 6 (χ2 (2, 15) = 8.533,
For New Text Same Genre, there was a slight rise in lexical complexity followed by a fall in Week 6. For syntactic complexity, there was a fall from Week 1 to Week 2 in all three measures followed by further falls for coordination and subordination. For accuracy, there was an increase from Week 1 to Week 2, which fell slightly in Week 6. Friedman tests conducted on all five measures revealed that none of the changes was significant.
For New Text Same Topic, there was a clear decrease in lexical complexity from Week 1 to Week 2, falling further in Week 6. For coordination, the writing became more complex in Week 2 and again in Week 6. For subordination, complexity rose from Week 1 to Week 2 before falling somewhat in Week 6. For phrasal complexity, the writing increased in complexity from Week 1 to Week 2, and increased further to Week 6. Friedman tests conducted for each complexity measure revealed that none of these changes was significant. With regard to accuracy, there was improvement from Week 1 to Week 2 before a fall in Week 6. A Friedman test revealed a significant change from Week 1 to Week 6: (χ2 (2, 15) = 8.933,
3 No intervention
Descriptive statistics for the group that did not receive WCF or comparators can be seen in Table 3. For Same Text Revision, there were improvements in lexical complexity, coordination, subordination and phrasal complexity from Week 1 to Week 2. With the exception of phrasal complexity, these fell back in Week 6. There was also improvement in accuracy from Week 1 to Week 2, and again to Week 6. Friedman tests for all measures revealed that none of these changes was significant.
Effects of no intervention on complexity and accuracy of Same Text Revision, New Text Same Genre, and New Text Same Topic (
For New Text Same Genre, there were improvements in all three syntactic complexity measures and in accuracy from Week 1 to Week 2. With the exception of subordination, these fell back in Week 6. There was a slight fall in lexical complexity. Friedman tests for all measures showed that none of these changes was significant.
For New Text Same Topic, there were reductions in lexical complexity, coordination, phrasal complexity, and increases in subordination and accuracy from Week 1 to Week 2. These were followed by subsequent increases in all complexity and accuracy measures from Week 2 to Week 6. Friedman tests showed that none of these changes was significant.
V Discussion
1 WCF
This study investigated the effects of comprehensive WCF and exposure to comparators on the complexity and accuracy of revisions, new texts of the same genre, and new texts involving the same topic but of a different genre. As expected, comprehensive WCF had a significant effect on the accuracy of revisions of the same text. This finding corroborates those of previous studies (Van Beuningen et al., 2008, 2012; Bonilla López et al., 2017; Karim & Nassaji, 2020), and provides further evidence of the benefits of comprehensive WCF on language development. Unexpectedly, comprehensive WCF also led to a significant increase in the syntactic complexity of revisions as measured by subordination. This study may be the first to report such a finding, which refutes Truscott’s (1996, 1999b, 2007) contention that WCF results in simplified writing. A closer look at the revisions reveals that the increase in subordination manifested itself in different ways. There was, for example, increased use of defining and non-defining relative clauses (e.g.
WCF had no significant effect on lexical complexity, or on the coordination or phrasal dimensions of syntactic complexity. It also had no significant effect of any kind on new texts, whether of the same genre or involving the same topic.
2 Comparators
Exposure to comparator texts had a positive influence in two ways. First, it had a significant effect on the lexical complexity of revisions of the same text. This finding is consistent with those of previous studies on comparators (e.g. Cánovas Guirao et al., 2015; Coyle et al., 2018; Hanaoka, 2007), which reported noticing of lexical items at the comparison stage. It suggests that an advantage of comparators may be their potential to supply the necessary covert lexical feedback to enable learners to notice holes, and thereby develop the range and sophistication of their vocabularies.
A closer look at changes made by individual learners following exposure to the comparators suggests different trajectories this development may take. One learner, for example, incorporated input from the comparators, absent from her first draft, into her immediate revision. This included both error-free words and phrases (e.g.
No significant improvement in lexical complexity after exposure to the comparators was seen when the learners wrote new texts involving the same topic but of a different genre. Nevertheless, a closer look at the texts suggests that transfer of new lexis from the comparators to the new genre did occur. For example, language from the comparators absent in the first picture description, was written in the immediate picture description revision in Week 2 both accurately (e.g.
The second area on which the comparators had a significant effect was accuracy. This unexpected result occurred in same text revisions and in new texts involving the same topic. A closer look at the revisions reveals incidences of self-correction following exposure to the comparators (e.g.
A closer look at the new texts involving the same topic but of a different genre texts particularly from Week 1 to Week 2 shows evidence of self-correction following comparison (e.g.
3 Limitations
There are of course several limitations to the study and the results must be considered with some caution. First, the small sample size meant that the data were often not normally distributed, precluding the use of parametric Repeated Measures ANOVAs. It was therefore not possible to make direct comparisons of the relative effects of WCF and comparators. Further research in the present context with its small class sizes would need to increase sample size by collecting data from consecutive cohorts or by comparing only one condition (WCF or comparators) with a control group. Second, the brief, ‘one-shot’ 10-minute intervention may have been insufficient to have much effect, especially in the long term and on new pieces of writing. Despite the possibility of other variables coming into play (Liu & Brown, 2015), sustained and repeated interventions are probably required in order to obtain a cumulative effect over longer time scales (Bruton, 2010; Ferris, 2004; Kang & Han, 2015; Karim & Nassaji, 2020; Polio, 2012; Truscott, 1999b; Xu, 2009). This accumulated feedback, or ‘massed’ practice (Bygate, 2001), would also have added ecological validity as it would reflect the repeated use of WCF (and comparators) in many real contexts. Another limitation is that the variables of motivation (Liu & Brown, 2015) and proficiency (Cánovas Guirao et al., 2015; Nassaji, 2010; Kang & Han, 2015) were not controlled for. Some studies of comparators, for example, have reported greater noticing at higher levels (García Mayo & Labandibar, 2017; Hanaoka, 2007) and more proficient learners may exhibit a ceiling effect (e.g. Bruton, 2009). However, the complexity and accuracy results for the three conditions at the first data collection point in Week 1 were checked for similarity. The non-significance of group differences suggests that the three groups were approximately comparable.
Finally, there was no pre-test collection of data from Narrative 2, which would have provided a more reliable benchmark against which changes in subsequent Narrative 2 drafts could have been measured.
VI Conclusions and implications for research and practice
These results suggest that the use of comparators can increase the lexical complexity and accuracy of revisions, and increase the accuracy of new texts involving the same topic. They also suggest that WCF can aid the development of the subordination dimension of syntactic complexity. Further studies with larger sample sizes permitting the use of more powerful parametric tests and research designs including iterative interventions are required to confirm these findings. The results also suggest that mere exposure to comparators is unlikely to promote the development of syntactic complexity, which is perhaps not always as salient or communicatively functional as lexis, and may consequently often pass beneath the learners’ radar. Therefore, it would also be worth examining the use of comparators combined with other interventions, such as enhanced input, to help raise consciousness of syntactic features. In the classroom, this would mean supplementing the comparators with interventions to promote guided noticing (Yang & Zhang, 2010), particularly at lower levels of proficiency (García Mayo & Labandibar, 2017), and training in their use (Yang & Zhang, 2010). It would also be important to research the effects of comparators on more cognitively demanding writing tasks requiring greater conceptualization. Exposure to comparators, for example, following a draft of a genre requiring greater criticality, such as an argumentative essay, may risk encouraging learners to reconceptualize their production, with the danger of placing ‘teachers’ words into students’ mouths’. A final area for future research would be investigations into learner and teacher attitudes towards the two interventions, and qualitative studies on learners’ use of comparators in order to shed light on the processes they elicit (Bruton, 2009; Han & Hyland, 2015; Liu & Brown, 2015; Storch & Wigglesworth, 2010).
Pending further research, comparators appear likely to have a useful role to play in ISLA. Their potential to facilitate the development of lexical complexity and accuracy suggests that they would be a welcome addition to the pedagogical toolkit of both teachers and course developers. It is suggested here that comparators and WCF are complementary means of providing a Focus on Form within a task-based cycle: they can be used in tandem, with comparison in a first iteration followed by WCF in a second. Such a cycle, which has also been proposed for spoken tasks (e.g. Lynch, 2009) and echoes Skehan’s (2001) trade-off hypothesis, may permit a greater focus on meaning and interlanguage extension in the first iteration, followed by a greater focus on form and accuracy in the second. This procedure, which may not be widely employed in ISLA today, might reassure stakeholders that a task-based Writing-to-Learn approach to second language development does include a comprehensive Focus on Form targeting both interlanguage restructuring and extension, thereby promoting the development of complexity and accuracy. It may also encourage learner autonomy and reduce the feedback workload of hardworking teachers.
Supplemental Material
Supplemental material for ‘The effects of direct written corrective feedback and comparator texts on the complexity and accuracy of revisions and new pieces of writing’ by Douglas Hamano-Bunce, Language Teaching Research:
- sj-doc-6-ltr-10.1177_13621688221127643
- sj-docx-1-ltr-10.1177_13621688221127643
- sj-docx-2-ltr-10.1177_13621688221127643
- sj-docx-3-ltr-10.1177_13621688221127643
- sj-docx-4-ltr-10.1177_13621688221127643
- sj-docx-5-ltr-10.1177_13621688221127643
- sj-docx-7-ltr-10.1177_13621688221127643
