Abstract
Introduction
English-language teaching (ELT) materials are instrumental in shaping the learning process and outcomes in English as a Foreign Language (EFL) contexts (Ryu & Jeon, 2020; Tomlinson & Masuhara, 2018; Vitta, 2023). While ELT textbooks serve as the primary medium for imparting language knowledge and skills to students (Hughes, 2019), reading in a foreign language can significantly affect academic performance (Roussel et al., 2017). In this context, evaluating ELT material has emerged as a central focus of textbook studies worldwide (Graves, 2019; Hoang & Crosthwaite, 2024; Tomlinson, 2012), including in China (G. Yang & Chen, 2013). Despite the burgeoning interest in textbook evaluation (H. Zhang et al., 2021), existing research predominantly relies on evaluators’ interpretations of checklists of evaluation criteria and other subjective methods, such as questionnaire surveys and interviews, and thus lacks objectivity (Cheng & Zhao, 2021; Tomlinson, 2012). Consequently, there remains a notable research gap in the systematic evaluation of the syntactic complexity and text readability of ELT materials, especially those designed for English majors at an intermediate proficiency level. Text readability and syntactic complexity serve as objective measures of text difficulty, providing scientific evidence for textbook compilation and selection.
Analyzing the text difficulty of textbooks—defined as the degree of accessibility of the text to readers—allows educators to match instructional materials with students’ current competence levels, ensuring that texts are neither too easy nor excessively challenging (Amendum et al., 2018). This alignment supports incremental language learning and helps prevent learner frustration and disengagement (Vega et al., 2013; Y. H. Yang et al., 2021). Regarding the evaluation of ELT materials, text difficulty is typically assessed at various levels, including lexical, syntactic, and textual aspects (Chen, 2016; Lei & Shi, 2023). Among these indicators, syntactic complexity and text readability are two key factors. Syntactic complexity, a critical determinant of text difficulty, plays a significant role in the development and selection of appropriate textbooks. Specifically, text difficulty, as measured by the syntactic complexity of textbooks, is widely considered to increase linearly with the progression of learners’ linguistic competence (Chen, 2016). This premise is grounded in Krashen’s (1985) input hypothesis, which posits that input slightly exceeding the learner’s current linguistic level constitutes the ideal input for promoting second language development and acquisition. Existing research has extensively examined the text complexity of different genres, such as academic papers and learner texts (Gedik & Kolsal, 2022; Wu et al., 2020), which may exhibit various syntactic complexity patterns (Hwang et al., 2020). In addition, the factors influencing syntactic complexity have been a significant focus of scholarly inquiry (Bulté & Housen, 2012). However, there is currently a paucity of empirical studies evaluating ELT materials that include various types of text. In particular, the syntactic complexity of ELT textbooks designed for English majors with intermediate language proficiency has not been adequately explored.
Text readability has long been on the research agenda of textbook evaluation studies (Hakim et al., 2021). Existing research has explored the pedagogical functions, measurements, and factors influencing text readability. Text readability is widely recognized as an essential factor contributing to learners’ academic performance (Peng, 2015). Textbooks with low readability may trigger learners’ negative emotions, which impede their internalization of input (Krashen, 1985). Analyzing the readability of textbooks offers a valuable reference for textbook compilation and curriculum design (Hakim et al., 2021; J. Hu et al., 2021; Zamanian & Heydari, 2012). In addition, scholarship in this regard has mostly centered on the measurement of text readability (Sato et al., 2008; Yeung et al., 2018; Zamanian & Heydari, 2012) and the factors that influence it (Eslami, 2014). However, there is a lack of focus on the nuanced and scientific measurement of text readability (Plakans & Bilkis, 2016), particularly in textbooks for university learners. In addition, syntactic complexity and text readability are inextricably connected (Crossley, 2024). Despite this connection, the relationship between syntactic complexity and text readability remains unclear. Understanding this interplay is crucial because it enables educators to assess the appropriateness of educational materials for learners. This comprehension helps strike a balance between linguistic richness and accessibility in intensive reading textbooks, ensuring that the materials are challenging yet understandable, thereby fostering more effective and engaging learning experiences for students.
Given the paramount importance and the gaps mentioned above, the present study proposes to explore the syntactic complexity, text readability, and their relationship in a series of
Literature Review
Studies of Material Evaluation
Material evaluation has attracted significant attention in recent decades (Hanifa, 2018; Mukundan & Nimehchisalem, 2012; M. Yang & Shi, 2020). Both theoretical and methodological efforts have been made to explore material evaluation (Sheldon, 1988; Tomlinson, 2012).
The following theoretical orientations have emerged in textbook evaluation studies: classroom-based evaluation (e.g., Ellis, 1997), language-focused evaluation (e.g., Crandall & Basturkmen, 2004), culture-emphasized evaluation (e.g., M. Yang & Shi, 2020), and sociolinguistics-oriented evaluation (e.g., Atar & Erdem, 2020). Material evaluation can be predictive, informing the development of new textbooks, or retrospective, assessing existing textbooks (Ellis, 1997). These theoretical approaches echo what is known as critical reflection in material evaluation studies (M. Yang & Wang, 2024). Critical reflection studies focus on reviewing previous research on textbook evaluation, designing textbook evaluation criteria, and developing instruments to evaluate textbooks (W. Hu, 2024; Tomlinson, 2020, 2022). This endeavor equips us with profound theoretical insights into material evaluation, but it seems to lack empirical support (H. Zhang et al., 2021).
Previous empirical studies on textbook evaluation can be analyzed from both macro and micro perspectives (M. Yang & Wang, 2024). Macro-empirical studies on textbook evaluation have explored the holistic design of textbooks, particularly in terms of text selection and agreement with the curriculum, but have failed to consider contextual divergences (Gholami et al., 2017; W. Zhang, 2014). Micro-empirical studies on textbook evaluation have examined linguistic content, involving lexical, grammatical, listening, and speaking elements (Chen, 2016; Hoang & Crosthwaite, 2024), as well as non-linguistic content, including culture, learner autonomy, ideology, and values. Existing micro-empirical research on textbook evaluation mainly takes the following two paths: The first path compares textbook corpora for EFL speakers with those designed for native speakers, aiming to reveal how these EFL textbooks diverge from authentic communication situations (Miller, 2011; Molavi et al., 2014). The second path explores the appropriateness of EFL textbooks, focusing on the coverage and frequency of words and phrases as stipulated in the curriculum (e.g., Liu & Zhang, 2015).
Early research has enriched our understanding of the vocabulary and phrases in these textbooks and provided evidence for their justified application. However, previous studies on material evaluation exhibit two limitations that must be addressed. First, these studies mainly involved holistic and qualitative evaluations (W. Zhang, 2014). They relied on subjective means, such as questionnaires and interviews, to evaluate textbooks and thus lacked objectivity (Cheng & Zhao, 2021). Text readability and syntactic complexity can offer objective means of measuring text difficulty, thus providing evidence for the compilation and selection of textbooks. Second, limited attention has been given to the syntactic complexity and text readability of textbooks. Textbooks serve as a major source of learner input; therefore, examining their syntactic complexity and text readability is essential for optimizing the language learning process (Abdollahi-Guilani, 2022; W. Hu, 2024).
Studies of Syntactic Complexity in Textbooks
Syntactic complexity refers to the diversity and complexity of syntactic structures in language production; that is, the degree of syntactic complexity and variation (Ortega, 2003). As a crucial indicator of text difficulty (Huang & Zheng, 2022), syntactic complexity plays a significant role in evaluating the appropriateness of textbooks, which serve as the primary source of input in second language (L2) development. While numerous components of textbooks have been explored, relatively little attention has been paid to the quality of textual input per se, particularly syntactic complexity. It is emphasized that text difficulty should align with readers’ competence by providing input that slightly exceeds their current level, a concept known as the “i + 1” principle (Krashen, 1985). Notably, syntactically complex texts pose considerable challenges to learners’ comprehension (Frantz et al., 2015). In addition, selecting and adapting textbooks with appropriate levels of text difficulty, as measured by syntactic complexity, is essential for achieving ideal learning outcomes (Spencer & Wagner, 2017). Because syntactic complexity is a significant component in evaluating the comprehensibility of textbooks, and syntactic modification is among the most frequently adopted methods for adapting teaching materials (Berendes et al., 2018), it is of great significance to investigate syntactic complexity in textbooks. Without the meticulous and methodical calibration of textbook difficulty across readers’ progressively increasing levels, the desired goal of competence development can hardly be attained (J. Song & Kim, 2021).
Motivated by these claims, extensive research has evaluated syntactic complexity using various quantitative indices to increase the reliability and validity of evaluation methods. Currently, two types of indicators are commonly used to measure syntactic complexity: large-grained and fine-grained indices. The former measure overall sentential or clausal complexity (e.g., sentence length) but fail to disclose the granularity of a specific language structure. Although there is general consensus regarding the positive relationship between measures such as mean length of T-unit (MLTU) and L2 development (Norris & Ortega, 2009), the interpretation of MLTU remains unclear, as various linguistic structures (e.g., phrasal dependents) can trigger an increase in the unit’s length. Therefore, more fine-grained indices have been proposed owing to developments in corpus linguistics. For instance, Biber et al. (1999) examined the clausal and phrasal features of academic writing using the Biber Tagger. Other computational tools have also been adopted, such as T.E.R.A (Solnyshkina et al., 2017), Coh-Metrix (Ryu & Jeon, 2020), and the L2SCA (Y. Li et al., 2022). Among these tools, the L2SCA is widely accepted for its robustness and operationalisability (Y. Li et al., 2022; Wu et al., 2020). Therefore, the present study utilizes the L2SCA to explore the nuanced syntactic complexity of a series of
Previous research has predominantly focused on isolated genres or specific types of text, creating a gap in understanding how syntactic complexity manifests in educational materials designed for language learners. Extensive research has been conducted on the syntactic complexity of various genres. For instance, researchers have explored syntactic complexity from perspectives such as textual genres and registers, including argumentative (Y. Li et al., 2022), expository, and narrative essays, academic genres (Hwang et al., 2020; Larsson & Kaatari, 2020; Verdiansyah et al., 2020; X. Zhang & Li, 2022), and translated texts (Lin et al., 2023). Previous research offers a holistic overview of the factors that influence syntactic complexity in specific genres. However, empirical studies on textbook materials that include various types of text remain scarce. To the best of our knowledge, it remains unclear precisely how syntactic complexity manifests in English language textbooks, which comprise different types of text, such as narration, argumentation, and exposition, and thus pose considerable challenges to learners.
Among the limited research on the syntactic complexity of textbooks, existing studies mainly focus on primary, middle, and high school textbooks, with little attention given to college-level textbooks. For instance, Arai et al. (2017) analyzed the syntactic complexity of primary school textbooks. Solnyshkina et al. (2017) explored the complexity of eight Russian English textbooks. Similarly, Ryu and Jeon (2020) analyzed text difficulty across grades in Korean middle school English textbooks using Coh-Metrix. Gedik and Kolsal (2022) explored syntactic complexity deficiencies in textbooks used to prepare for high school and college entrance examinations. These studies suggest that the syntactic complexity of textbooks is related to learners’ education levels. Syntactic complexity is calibrated to address individual differences among learners at various stages of their education, including learner variables such as language competence, memory capacity, and motivation. For instance, textbooks for advanced learners (Zheng, 2018) and preliminary-level learners (Hwang et al., 2020) displayed diverse features of syntactic complexity. However, a review of the relevant literature indicates that research has primarily focused on the syntactic complexity features of textbooks intended for primary and middle school learners. There has been a significant lack of research exploring the features of syntactic complexity in textbooks designed for learners in tertiary education. To address this gap, the current research intends to explore the syntactic complexity patterns of textbooks for English majors, who are typically at an intermediate level of English language proficiency.
Studies of Text Readability in Textbooks
Text readability refers to the level of ease with which a text can be read and understood (Goodman & Flurkey, 2019). The readability of textbooks can affect students’ academic performance. It has been demonstrated that less readable textbooks tend to result in lower average grades for the associated courses (Peng, 2015). Moreover, challenging textbooks might cause students to fail to comprehend the material, leading to frustration, and could also place a burden on instructors, who must ensure that the materials are understood when explaining the content to learners (Peng, 2015). Learners in such situations, filled with anxiety and uncertainty and feeling threatened, may become demotivated when receiving and processing input (Krashen, 1985; C. Li et al., 2024). Therefore, it is incumbent upon instructors to carefully select appropriate textbooks that are readable for learners.
Given the importance of textbook readability, researchers have primarily examined its measurement (Yeung et al., 2018) and the factors influencing text readability (Eslami, 2014). The most common methods used to measure readability are formulas. Nevertheless, readability assessment is often criticized for a lack of reliability when its results are interpreted through instructors’ intuition (Plakans & Bilkis, 2016). In contrast, examining the readability of textbooks through data analytics could improve the quantification of textbook evaluation and promote the reading development of novice English as a second language learners (Kasule, 2011). The most commonly used formulas currently include the Flesch Reading Ease (FRE), Flesch-Kincaid Grade Level (FKG), Automated Readability Index (ARI), Coleman-Liau Index (CLI), Gunning Fog Index (GFI), and Simple Measure of Gobbledygook (SMOG) (Yeung et al., 2018). These six formulas were used in this study. Existing research on the readability of textbooks mainly focuses on the primary and middle school levels in EFL contexts, such as Indonesia (Hakim et al., 2021) and Hong Kong and mainland China (J. Hu et al., 2021). However, the readability of textbooks for university students, particularly tertiary English majors, remains poorly understood. Since analyzing the readability of textbooks can provide scientific evidence to inform textbook compilation and accommodate learners at different levels of education (Hakim et al., 2021; J. Hu et al., 2021), it is necessary to further investigate the readability features of textbooks for tertiary English majors.
Despite the extensive research on readability, most studies have focused primarily on the lexical features of textbooks, leaving a significant gap in our understanding of the impact of syntactic complexity on readability. Researchers have explored the various factors influencing text readability, revealing that it is shaped by both linguistic and non-linguistic factors (Bailin & Grafstein, 2016). Reading motivation (Goodman & Flurkey, 2019), cultural background (Bailin & Grafstein, 2016) and reading environment have been identified as important variables in determining readability. In addition, readability is influenced by linguistic factors such as word length, the proportion of different word classes, sentence length (Bailin & Grafstein, 2016), pronouns, the number of syllables (Sung et al., 2015), and the numbers of affixes, prepositional phrases, and others (Bailin & Grafstein, 2016). In particular, syntactic-related linguistic features exert a significant influence on readability (Eslami, 2014). These features include the complexity of sentence structures, the use of dependent clauses, and the overall syntactic arrangement of a text, all of which can significantly affect how easily a reader comprehends and processes the material (Eslami, 2014). However, the predominant focus of existing research has been on the lexical features of textbooks. For instance, Y. Wang (2021) explored the relationship between lexical complexity, measured by the diversity and sophistication of vocabulary, and the readability of textbooks for English majors. J. Hu et al. (2021) investigated readability in terms of lexical coverage, or the extent to which words in a text are known by the target audience in science textbooks.
This lexical focus has provided valuable insights but also highlights a critical gap in the literature regarding syntactic complexity. Although lexical features are undoubtedly important, they do not capture the full scope of what makes text readable. Syntactic complexity, which encompasses elements such as sentence length, the ratio of dependent to independent clauses, and the use of varied syntactic structures, plays a crucial role in readability. Texts with complex syntactic structures may be challenging for readers, even if the vocabulary is relatively simple. Therefore, a comprehensive understanding of readability must include analyses of both syntactic complexity and lexical features. This gap in current research emphasizes the need for further exploration of how syntactic complexity impacts textbook readability, with the aim of providing a more holistic and nuanced understanding that can inform the development of educational materials. Furthermore, considerable uncertainty remains regarding the relationship between syntactic complexity and textbook readability, which warrants further exploration.
Syntactic Complexity and Text Readability in Textbooks
The relationship between syntactic complexity and text readability has been extensively explored, as it is considered a crucial factor in understanding and interpreting a text. Syntactic complexity has been identified as an important aspect of text readability because it can affect the ease with which readers comprehend a text (Khademizadeh & Vaezi, 2020).
Early studies have focused on sentential features when calculating readability scores (Frantz et al., 2015; Wu, 2017). Readability is closely associated with syntactic complexity (Frantz et al., 2015). It is important to note that the accuracy of measuring readability is undermined if syntactic complexity is not considered (Xing & Cheng, 2010). Furthermore, research has shown that syntactic complexity has a critical influence on the readability of academic papers written by Chinese scholars (Wu, 2017). This influence is also evident in the development of reading materials and textbooks (Khademizadeh & Vaezi, 2020). Therefore, it is essential to examine the relationship between syntactic complexity and readability in textbooks, as textbooks are considered the primary source of input in second language learning and play a vital role in the language development of students (Peng, 2015). However, only a few studies have explored the relationship between syntactic complexity and text readability, particularly in textbooks (Wu, 2017). Thus, exploring the relationship between syntactic complexity and readability in textbooks is crucial as it could provide valuable insights into the development of effective teaching materials and curriculum design for students at different proficiency levels (Crossley et al., 2008; Hakim et al., 2021; J. Hu et al., 2021). In line with this rationale, this study aims to investigate the relationship between syntactic complexity and readability in textbooks.
In summary, the literature review above reveals three gaps. First, early studies mainly examined the syntactic complexity and readability of written texts, such as learner texts and published academic papers, but paid inadequate attention to these two aspects in textbooks. Second, previous studies primarily focused on the lexical level, giving little attention to the syntactic complexity and readability of textbooks (Tang & Liang, 2021). Third, past research has mostly focused on textbooks for primary and middle school students, and little is known about textbooks for tertiary English majors. It has been reported that the difficulty of English textbooks for Chinese universities does not align with their content volume and lacks generic diversity (H. Zhang et al., 2021). To address these gaps, the present study aims to examine syntactic complexity, text readability, and their relationship in a series of
Research Design
Research Questions
To characterize the syntactic complexity and text readability of a series of
Research Question 1: What are the features of syntactic complexity in
Research Question 2: What are the features of text readability in
Research Question 3: How does syntactic complexity predict text readability in
Source of the Corpus
The corpus for this study was based on a series of
To ensure that the corpus accurately reflected the syntactic characteristics of the complete texts, each text was sampled in its entirety. This approach was adopted to account for different stylistic features that may be present at the beginning, throughout the narrative, and at the end of an article. The main text in each unit was converted into .txt format, and irrelevant words in pictures and headings were excluded. To ensure the reliability of the text conversion during corpus design, two professors, who had been using the selected textbooks for the past decade, were consulted. Four teaching assistants with master’s degrees in applied linguistics were recruited to review and cross-check the converted texts. The resulting dataset had a size of 1.56 MB and contained 111,723 words for further data analysis. Specifically, Book 1 of the selected series of Intensive Reading textbooks contains 20,682 words, Book 2 contains 26,450 words, Book 3 contains 31,618 words, and Book 4 contains 32,973 words.
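As a minimal sketch of this preprocessing step, the per-book token counts could be reproduced with a short script. The folder layout and file names below are hypothetical, and the token definition is a simple approximation rather than the exact counting rule used by the tools in this study.

```python
import re
from pathlib import Path

def corpus_word_counts(folder):
    """Count word tokens in each converted .txt file in a folder.

    Assumes one plain-text file per book (e.g., book1.txt ... book4.txt);
    tokens are approximated as runs of letters and apostrophes.
    """
    counts = {}
    for path in sorted(Path(folder).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        counts[path.stem] = len(re.findall(r"[A-Za-z']+", text))
    return counts
```

Run over the four converted books, such a script should yield totals close to the reported 20,682; 26,450; 31,618; and 32,973 words, with small deviations depending on the tokenization rule.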
Measurement Indicators and Instrument
Data mining is an interdisciplinary practice that refers to the process of unveiling hidden patterns of behaviors from large datasets (CheshmehSohrabi & Mashhadi, 2023). This involves a hybrid application of machine learning, pattern recognition, and statistics. Although data mining techniques have been widely applied across various disciplines (Cope & Kalantzis, 2016), their application in the fields of education, applied linguistics, and language learning has only recently emerged (Warschauer et al., 2019). For instance, data mining techniques have been employed to understand learner behavior by analyzing syntactic complexity (Lu, 2010; X. Zhang & Lu, 2022) and text readability (Crossley et al., 2008) in L2 writing research. Previous research has demonstrated the potential of applying data mining techniques to explore learner behavior in the fields of language education and applied linguistics. Therefore, this study proposes to employ data mining techniques, such as L2SCA (Lu, 2010) and FRE, FKG, ARI, CLI, GFI, and SMOG (Yeung et al., 2018) to explore syntactic complexity and text readability of a series of
Measures and Instrument for Syntactic Complexity
Compared with large-grained measures of the overall complexity of syntactic structures, fine-grained indicators accurately assess specific language structures, such as the length of complex nominal phrases (Kyle & Crossley, 2018). Of these nuanced measurements, the L2SCA, developed using Python, has been widely accepted due to its robustness and operationalisability (Y. Li et al., 2022). This has been empirically validated through subsequent research on L2 writing, second language acquisition, and textbook studies (Lu, 2010; X. Zhang & Lu, 2022). Therefore, the present study adopts the L2SCA to explore the syntactic complexity of a series of
The L2SCA (accessible at http://www.personal.psu.edu/xxl13/downloads/l2sca.html), as an automatic tool for measuring L2 syntactic complexity, encompasses the following 14 indicators: mean length of sentence (MLS), mean length of T-unit (MLT), mean clause length (MLC), clauses per sentence (C/S), verb phrases per T-unit (VP/T), clauses per T-unit (C/T), dependent clauses per clause (DC/C), dependent clauses per T-unit (DC/T), T-units per sentence (T/S), complex T-units per T-unit (CT/T), coordinate phrases per T-unit (CP/T), coordinate phrases per clause (CP/C), complex nominals per T-unit (CN/T), and complex nominals per clause (CN/C). These indicators are utilized in the present study to measure the syntactic complexity of the texts included in the selected textbooks.
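To illustrate what a large-grained index captures, the sketch below computes MLS (words per sentence) with naive regex-based segmentation. This is only a toy approximation: the actual L2SCA derives T-units, clauses, and phrases from a full syntactic parse, which no simple pattern matching can replicate.

```python
import re

def mean_length_of_sentence(text):
    """Approximate MLS: word tokens divided by sentence count.

    Sentences are split naively on terminal punctuation; words are runs
    of letters and apostrophes. L2SCA itself uses a syntactic parser.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    return len(words) / len(sentences)
```

For ratio indices such as C/S or DC/C, the numerator and denominator counts would come from the parser's clause and T-unit identification rather than from surface patterns like these.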
Measures and Instrument for Text Readability
The instruments applied for the present study to quantify text readability are several software tools commonly used in this field (Yeung et al., 2018). These instruments are available at https://we.sflep.com/research/ReadingEase.aspx, sponsored by
The measures employed in the present study to assess text readability include the following six indicators: FRE (Cantos Gómez & Sánchez Lafuente, 2019), FKG (J. Hu et al., 2021), ARI (De Oliveira et al., 2015), CLI (Cantos Gómez & Sánchez Lafuente, 2019), GFI (De Oliveira et al., 2015), and SMOG (Cantos Gómez & Sánchez Lafuente, 2019).
These measures have been applied as essential indicators in studies of text readability across various disciplines, such as health care, information science (Lei & Yan, 2016), education (Cantos Gómez & Sánchez Lafuente, 2019), psychology (Amendum et al., 2018), and tourism (Dolnicar & Chapple, 2015). Recently, these indicators have been widely examined in academic contexts, languages, and linguistic studies (Y. Wang, 2021; Yeung et al., 2018). Research on text readability from the perspective of these indicators in academic writing (S. Wang et al., 2022) and textbooks (Y. Wang, 2021) has validated the reliability of these common formulas for calculating text readability. Therefore, the present study proposes the following measures to explore the text readability of a series of
Taking the CLI as an example, the index is computed as CLI = 0.0588 × L − 0.296 × S − 15.8, where L represents the average number of letters per 100 words and S represents the average number of sentences per 100 words. The resulting value approximately corresponds to the grade levels of American primary and middle schools.
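All six indicators are simple functions of surface counts (words, sentences, characters, syllables). The sketch below implements their commonly published forms; the website used in this study may apply slightly different variants or count syllables differently, so its results can deviate from those produced here.

```python
import math

def fre(words, sentences, syllables):
    """Flesch Reading Ease: higher values indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def fkg(words, sentences, syllables):
    """Flesch-Kincaid Grade Level (approximate U.S. grade)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def ari(characters, words, sentences):
    """Automated Readability Index."""
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43

def cli(letters, words, sentences):
    """Coleman-Liau Index; L and S are averages per 100 words."""
    L = letters / words * 100
    S = sentences / words * 100
    return 0.0588 * L - 0.296 * S - 15.8

def gfi(words, sentences, complex_words):
    """Gunning Fog Index; complex words have three or more syllables."""
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))

def smog(polysyllables, sentences):
    """SMOG grade, intended for samples of 30 or more sentences."""
    return 1.0430 * math.sqrt(polysyllables * 30 / sentences) + 3.1291
```

Note that FRE decreases as text becomes harder, whereas the other five indices rise with difficulty; this is why a falling FRE and rising grade-level indices describe the same trend.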
Data Analysis
To answer Research Question 1, which explored the syntactic complexity characteristics of the selected series of
Similarly, to address Research Question 2, which enquired about the readability features of the selected series of textbooks, six indicators of text readability were calculated, as introduced in Section 3.3.2. The readability features of the selected series of textbooks were then quantified to examine the patterns of text readability in Books 1 through 4. To this end, the converted texts were entered into a website (https://we.sflep.com/research/ReadingEase.aspx). To gain access to this website, we contacted the sponsor of the China Foreign Language Teaching Network to obtain permission. Upon obtaining consent, we registered on the website and input the texts to calculate the readability of each book in terms of six indicators: FRE, FKG, ARI, CLI, GFI, and SMOG. To ensure the reliability of our data analysis, we consulted a PhD candidate in computing science on how to operate the software and interpret the results of the six formulas. Subsequently, we calculated the text readability of each of the four books through the website and recalculated it after one month to check the accuracy of the analysis results. The results were processed in an Excel file for subsequent analyses.
The data in Excel format derived from the analyses for the first two research questions were then imported into SPSS 27.0 to address Research Question 3. This question investigates how syntax influences the text readability of the selected series of
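The correlation analysis itself was run in SPSS 27.0. Purely as an illustration of the underlying computation, the Pearson coefficient for one syntactic index against one readability index can be reproduced in plain Python from the per-book values reported in the Results section (MLS and FKG for Books 1 through 4):

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Per-book values reported in the Results section:
mls = [11.2341, 11.0762, 13.9103, 16.7376]  # mean length of sentence
fkg = [3.91, 5.55, 7.00, 7.20]              # Flesch-Kincaid Grade

r = pearson_r(mls, fkg)  # positive: longer sentences, higher grade level
```

A full replication of Research Question 3 would compute this for all 14 syntactic indices against all six readability indices and then fit regression models, as was done in SPSS.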
Results
Syntactic Complexity of the Selected Intensive Reading Textbooks for English Majors
Table 1 reports the distribution of syntactic complexity in the selected series of
Syntactic Complexity of the Selected
It is noteworthy that the mean length of sentence (MLS) decreased from 11.2341 in Book 1 to 11.0762 in Book 2, but increased to 13.9103 in Book 3 and 16.7376 in Book 4. A slight decrease was also noticed in clauses per sentence (C/S) in Book 2 (1.4125), which is lower than the 1.554 in Book 1; the value then increased to 1.7215 in Book 3 and 1.8898 in Book 4. Similarly, T-units per sentence (T/S) decreased from 1.0788 in Book 1 to 0.9062 in Book 2, but increased to 1.0189 in Book 3 and 1.1096 in Book 4.
In contrast to the three indicators mentioned above, the remaining 11 showed a steady increase from Books 1 to 4. Specifically, the mean length of T-unit (MLT) increased from 10.4139 in Book 1 to 12.2227 in Book 2, 13.6520 in Book 3, and 15.0837 in Book 4. The mean clause lengths (MLC) from Books 1 to 4 were 7.2289, 7.8147, 8.0802, and 8.8566, respectively. In addition, verb phrases per T-unit (VP/T) increased from 1.6687 in Book 1 to 2.1404 in Book 4, and clauses per T-unit (C/T) increased from 1.4406 in Book 1 to 1.7031 in Book 4. This steady growth was also observed in dependent clauses per clause (DC/C), which increased from 0.2702 in Book 1 to 0.3763 in Book 4. Dependent clauses per T-unit (DC/T) increased from 0.3892 in Book 1 to 0.6409 in Book 4, and complex T-units per T-unit (CT/T) increased from 0.2915 in Book 1 to 0.4286 in Book 4. Similarly, coordinate phrases per T-unit (CP/T) increased from 0.1964 in Book 1 to 0.3632 in Book 4; coordinate phrases per clause (CP/C) moved up from 0.1363 in Book 1 to 0.2133 in Book 4; and complex nominals per T-unit (CN/T) rose from 0.8882 in Book 1 to 1.6121 in Book 4. The number of complex nominals per clause (CN/C) increased from 0.6166 in Book 1 to 0.9465 in Book 4.
Text Readability of the Selected Intensive Reading Textbooks for English Majors
Table 2 presents the results for the six indicators of text readability for the selected series of intensive reading textbooks.
Text Readability of the Selected Intensive Reading Textbooks for English Majors.
Specifically, the FRE values for the four books are 89.32 (Book 1), 80.47 (Book 2), 70.66 (Book 3), and 73.77 (Book 4). It can be observed that Books 1 and 2 fall into the "easy" band (80–90) of the Flesch scale, whereas Books 3 and 4 fall into the "fairly easy" band (70–80), indicating an overall decrease in readability across the series.
A similar pattern of decreasing readability was observed for the FKG and ARI indicators. The FKG indices for Books 1 to 4 are 3.91, 5.55, 7.00, and 7.20, respectively, suggesting that the readability of the four books broadly corresponds to Grades 4, 6, 7, and 7 for American students. The average FKG index across the four books was 5.92, suggesting that the selected series is roughly equal to the Grade 6 level of American students. Similarly, Table 2 shows that the ARI indices for the four books are 4.13 (Book 1), 6.06 (Book 2), 7.77 (Book 3), and 8.17 (Book 4), respectively, corresponding to Grades 4, 6, 8, and 8 of American learners. The average ARI across the four books was 6.53, which is roughly equivalent to Grade 7 for American learners.
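The FRE, FKG, and ARI values reported above are produced by the standard published formulas (Flesch, 1948; Kincaid et al., 1975; ARI), which depend only on sentence, word, syllable, and character counts. As a hedged sketch with hypothetical counts:

```python
# Standard formulas for three of the readability indices discussed
# above. The counts passed in are hypothetical illustration values,
# not the study's data.

def flesch_reading_ease(words, sentences, syllables):
    # Higher score = easier text (90-100 very easy, 80-90 easy, ...)
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words, sentences, syllables):
    # Result approximates a U.S. school grade level
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def automated_readability_index(characters, words, sentences):
    # Character-based grade-level estimate
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43

# Hypothetical counts for a short passage
fre = flesch_reading_ease(words=1100, sentences=100, syllables=1450)
fkg = flesch_kincaid_grade(words=1100, sentences=100, syllables=1450)
ari = automated_readability_index(characters=5200, words=1100, sentences=100)
print(round(fre, 2), round(fkg, 2), round(ari, 2))
```

Note that FRE runs in the opposite direction from the grade-level indices: a lower FRE means lower readability, whereas a higher FKG or ARI means lower readability.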
Correlation Coefficients (r) Between Syntactic Complexity and Text Readability.
The fourth indicator (CLI) also shows a decreasing trend in readability from Books 1 to 4. The CLI values for the four books are 5.80 (Book 1), 7.21 (Book 2), 8.49 (Book 3), and 8.61 (Book 4), and the average index across the four books was 7.53. These values indicate that the readability of the four books roughly corresponds to Grades 6, 7, 8, and 9 for American learners, respectively, with the series as a whole corresponding to Grade 8.
The fifth and sixth indicators, the GFI and SMOG, also display a decreasing pattern of readability. The GFI values for the four books are 6.73 (Book 1), 8.49 (Book 2), 10.14 (Book 3), and 10.22 (Book 4), respectively. With regard to SMOG, the respective values for the four books are 6.95 (Book 1), 8.44 (Book 2), 9.51 (Book 3), and 9.49 (Book 4). The average indices of these two indicators across the entire series are 8.90 and 8.60, respectively. In terms of GFI, these values indicate that the readability of the four books corresponds to Grades 7, 8, 10, and 10 for American learners.
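Like the indices above, GFI, SMOG, and CLI follow standard published formulas (Gunning, 1952; McLaughlin, 1969; Coleman & Liau, 1975). A minimal sketch with hypothetical counts:

```python
import math

# Standard formulas for the remaining three grade-level indices.
# All counts below are hypothetical illustration values.

def gunning_fog(words, sentences, complex_words):
    # complex_words: words of three or more syllables
    return 0.4 * (words / sentences + 100 * complex_words / words)

def smog(sentences, polysyllables):
    # polysyllables: words of three or more syllables, normalized to
    # a 30-sentence sample
    return 1.0430 * math.sqrt(polysyllables * 30 / sentences) + 3.1291

def coleman_liau(letters, words, sentences):
    L = letters / words * 100    # letters per 100 words
    S = sentences / words * 100  # sentences per 100 words
    return 0.0588 * L - 0.296 * S - 15.8

gfi = gunning_fog(words=1100, sentences=100, complex_words=80)
smog_idx = smog(sentences=100, polysyllables=80)
cli = coleman_liau(letters=5200, words=1100, sentences=100)
print(round(gfi, 2), round(smog_idx, 2), round(cli, 2))
```

Because GFI and SMOG both weight polysyllabic words heavily, they tend to track each other, which is consistent with the parallel patterns reported above.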
Influence of Syntactic Complexity as a Predictor Variable on the Text Readability of the Selected Textbooks
To explore the relationship between syntactic complexity (SC) and text readability of the selected textbooks, Pearson’s correlation was first conducted using SPSS 27.0. Table 3 presents the correlation coefficients between syntactic complexity and text readability.
As shown in Table 3, all 14 indicators of syntactic complexity are negatively correlated with FRE and positively correlated with the other five readability indicators (FKG, ARI, CLI, GFI, and SMOG). Among these correlations, those between CN/T and the six readability indicators were the highest: −.814 with FRE, .928 with FKG, .923 with ARI, .829 with CLI, .926 with GFI, and .911 with SMOG. All of these coefficients were higher than those between the remaining 13 indicators of syntactic complexity and the six indicators of readability. These results indicate that, among the 14 indicators of syntactic complexity, CN/T had the strongest significant correlation with the six indicators of readability.
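The statistic behind Table 3 is Pearson's product-moment correlation. As a minimal sketch, computed here in plain Python rather than SPSS, with hypothetical per-book CN/T values paired with the FKG indices reported above:

```python
import math

# Pearson's r from first principles. The CN/T values are hypothetical
# illustration values; the FKG values are the indices reported above.

def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

cn_t = [0.89, 1.10, 1.35, 1.61]   # hypothetical per-book CN/T values
fkg = [3.91, 5.55, 7.00, 7.20]    # FKG indices reported above
r = pearson_r(cn_t, fkg)
print(round(r, 3))                # strong positive correlation
```

In practice `scipy.stats.pearsonr` would also return a p-value for the significance tests reported in Table 3; the sketch above shows only the coefficient itself.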
The significant correlations between the 14 indicators of syntactic complexity and the six indicators of readability highlight the feasibility of conducting multiple regression analysis to examine how syntactic complexity influences readability. Therefore, using the 14 indicators of syntactic complexity as independent variables and the six indicators of text readability as dependent variables, a multiple regression analysis was performed to explore the extent to which syntactic complexity accounts for readability, addressing the third research question.
Table 4 reports the model summaries from the multiple stepwise regression analysis. It shows that all 14 indicators of syntactic complexity displayed a predictive effect on the text readability of the selected series of intensive reading textbooks.
Model Summary of the Multiple Regression Analysis.
Predictors (Constant): CN/C, T/S, C/T, CP/C, VP/T, CT/T, MLC, DC/C, CN/T, DC/T, CP/T, MLS, C/S, MLT.
Fourteen syntactic complexity indicators were included as independent variables in the regression model. The multiple regression analysis examining the predictive effect of these 14 indicators on the six indicators of text readability yielded six prediction model expressions, as shown in Table 5. Among the 14 indicators, DC/C is the strongest predictor of FRE; C/S is the strongest predictor of both FKG and CLI; and C/T has the strongest predictive effect on ARI, GFI, and SMOG. In summary, these six regression formulas show that C/T (clauses per T-unit), C/S (clauses per sentence), and DC/C (dependent clauses per clause) are the three strongest indicators for predicting the text readability of the selected series of intensive reading textbooks.
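The B coefficients in Table 5 come from least-squares estimation. As a hedged, single-predictor sketch of the idea (the study's actual models were stepwise and multivariate, fitted in SPSS), using hypothetical per-text C/T observations against the SMOG indices reported above:

```python
# Ordinary least squares with one predictor, from first principles.
# The C/T values are hypothetical illustration values; the SMOG values
# are the indices reported above. This is not a replication of the
# study's stepwise models.

def ols(x, y):
    """Return (intercept, slope) minimizing the sum of squared errors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

c_t = [1.44, 1.41, 1.58, 1.70]    # hypothetical C/T observations
smog = [6.95, 8.44, 9.51, 9.49]   # SMOG indices reported above
intercept, slope = ols(c_t, smog)
predicted = intercept + slope * 1.60  # predicted SMOG at C/T = 1.60
print(round(slope, 2), round(predicted, 2))
```

A positive slope here mirrors the reported direction of effect: more clauses per T-unit predicts a higher SMOG grade level, i.e., lower readability.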
B Coefficients of the Multiple Regression Analysis.
Discussion
This study explored the features of syntactic complexity, readability, and the predictive relationship between them in a selected series of intensive reading textbooks for English majors.
Specifically, this study explored the features of syntactic complexity in a selected series of intensive reading textbooks for English majors.
Additionally, the results for the first research question indicated certain problems with the selected series of textbooks. While 11 of the 14 indicators of syntactic complexity consistently revealed a systematic progression in difficulty from Books 1 to 4, three indicators, namely MLS, C/S, and T/S, demonstrated a certain divergence. That is, these three indicators experienced a slight decrease in Book 2 compared with Book 1, followed by a continuous increase from Book 2 through Books 3 and 4. These results indicate a violation of the principle of progressive difficulty in textbook compilation for the three indicators, as previously identified in EFL textbooks (Jin et al., 2020; Ryu & Jeon, 2020) and in studies on Chinese as a Second Language textbooks (Cao et al., 2022). This result partially corroborates previous research showing that ELT textbooks for Chinese English majors lack a reasonable distribution of text difficulty (H. Zhang et al., 2021), and further indicates that the selected series was problematic in its compilation with respect to these three indicators. Therefore, it is imperative for developers to consider these three syntactic complexity factors. This need arises because MLS, C/S, and T/S are effective indicators of text difficulty (Graesser et al., 2007; Lu, 2010), cognitive load, competence in mental processing (Bonzo, 2008), and strategies of textbook development (Lei & Shi, 2023).
While the first research question revealed an increasing trend in syntactic complexity, the second research question of the present study suggested a decreasing trend in text readability from Books 1 to 4 in the selected series of textbooks, indicating an increasing progression of difficulty (J. Song & Kim, 2021). This result aligns with the syntactic complexity reported earlier in this study, highlighting that syntactic features such as the number and length of sentences, complexity of vocabulary, and number of syllables are closely related to readability (Bailin & Grafstein, 2016; Sung et al., 2015). It also reflects the logical progression of difficulty in textbook development and embodies the developmental process of foreign language acquisition (Chen, 2016).
Another finding associated with Research Question 2 is that the readabilities of Books 1 and 2 correspond to those of seniors in American primary schools and juniors in American middle schools, while the readabilities of Books 3 and 4 correspond to the levels of seniors in American middle schools and juniors in American high schools. Readability is closely related to factors such as motivation (Goodman & Flurkey, 2019) and cultural background knowledge (Bailin & Grafstein, 2016). Considering that English functions as a foreign language in China, where EFL learners have limited opportunities to interact with native English speakers and cultures, achieving native-level reading proficiency can be challenging for Chinese EFL learners. Therefore, it is logical for textbooks to be designed with readability that caters to the specific context of English language learning in China.
A third noteworthy finding related to Research Question 2 is that, while four indicators—FKG, ARI, CLI, and GFI—consistently displayed a decreasing pattern of readability from Books 1 to 4, FRE and SMOG showed certain variations in Book 3. In other words, Book 3 exhibited the lowest FRE value and the highest SMOG value, indicating the lowest degree of readability, and correspondingly the highest degree of difficulty, compared with the other three books. This result diverges from previous studies, which suggest that indicators such as FRE demonstrate a progressive distribution of readability in English textbooks (Ryu & Jeon, 2020). This result highlights issues with the selected series in terms of readability, as indicated by the FRE and SMOG. Since these two indicators are essential for the adaptation and development of textbooks and teaching materials (Im et al., 2015), it is important for textbook developers to consider them in further compilation.
Regarding the influence of syntactic complexity as a predictor variable on text readability (Research Question 3), the present study found that all 14 indicators of the former could predict the latter to some degree, with C/T (clauses per T-unit), C/S (clauses per sentence), and DC/C (dependent clauses per clause) being the strongest indicators in predicting the text readability of the selected series of textbooks. These results suggest that subordinate structures strongly influence text readability (Eslami, 2014; Kyle & Crossley, 2018). This can be explained as follows: the above subordinate structures are in fact embedded syntactic structures, which offer flexibility in expressing ideas and thus function as major indicators of readability (Wu, 2017; X. Zhang & Li, 2022). In addition, the results regarding the predictive power of subordinate structures such as C/T, C/S, and DC/C on readability contradict those of previous studies conducted in the fields of academic writing (S. Wang et al., 2022; Wu, 2017), extracurricular reading materials (Lei & Shi, 2023), and teaching resources used in primary and secondary schools (Jin et al., 2020). This inconsistency may indicate the unique features of the selected series of textbooks for tertiary English majors, thus warranting further exploration.
Conclusion
This study found that 11 of the 14 indicators of syntactic complexity consistently demonstrated a systematic progression of difficulty from Books 1 to 4 of the selected series of textbooks. However, three indicators, namely MLS, C/S, and T/S, showed the opposite tendency. In addition, the selected series lacked systematic readability, as evidenced by the deviation of the FRE and SMOG indicators in Book 3. Finally, the results indicate that all 14 indicators of syntactic complexity contributed to predicting readability to some extent. In particular, clause-related features, such as clauses per T-unit (C/T), clauses per sentence (C/S), and dependent clauses per clause (DC/C), are the strongest predictors of readability among the 14 indicators of syntactic complexity.
This study’s findings have several theoretical implications. First, the results demonstrate that syntactic complexity and readability can serve as important methods for quantitative materials evaluation, thus mitigating the shortcomings of previous research, which has been criticized as subjective and qualitative (Cheng & Zhao, 2021). For instance, the inclusion of 14 indicators could enrich the literature on syntactic complexity and provide empirical evidence for L2SCA in the context of textbook evaluation. In contrast to most previous studies, which included only one or two readability indicators, this study examined all six commonly recommended readability indicators, thereby extending the readability indicators explored in previous studies. In addition, this study examined the predictive effect of syntactic complexity on readability, an area that has rarely been addressed in the existing literature (Wu, 2017), and thus expands the scope of research in this field. Finally, unlike previous studies that primarily examined learners’ written texts or published academic papers, or focused on textbooks for primary and middle school students, this study extends its focus to university English majors, who remain underexplored.
This study has significant implications for the development of EFL teaching methods and textbooks. For instance, this study found that genre plays a crucial role in contributing to syntactic complexity. Thus, genre should be carefully considered when designing textbooks for learners in EFL and English-medium instruction (EMI) contexts, where teaching materials are aimed at learning content subjects in English (C. Li, 2023; Richards & Pun, 2022; Widodo et al., 2022). Second, this study found that the selected
It should be acknowledged that the present study has certain limitations. First, given the large number of indicators proposed by different scholars, it is not feasible to cover all the measures in one study; therefore, the findings generated by other indicators may vary. Second, this study focused exclusively on the features of syntactic complexity and readability in a series of intensive reading textbooks for English majors.
