Abstract
Keywords
Introduction
Autism spectrum disorder (ASD) is a developmental disorder caused by brain differences, and its core features include differences in social interaction and communication, as well as restricted and repetitive patterns of behavior, interests, or activities. The early interactions between caregivers and toddlers are crucial for toddlers’ early social environment. Notably, infants and toddlers later diagnosed with ASD often display a reduced tendency to seek, respond to, and initiate social experiences in early life. Such differences can disrupt their social interaction with caregivers (Dawson, 2008; Green et al., 2015; Hobson & Lee, 2010).
Caregiver Language Input
Caregiver language input not only acts as a key determinant of children's language acquisition but also structures and sustains interactions that shape multiple developmental milestones, including social communication (Adamson et al., 2015; Bottema-Beutel et al., 2014). Characterizing the features of caregiver input—and how these features may differ between caregivers of children with ASD and their typically developing (TD) counterparts—is equally relevant, as caregivers of children with ASD may adapt their speech in distinct ways to respond to their children's social and linguistic profiles (Nadig & Bang, 2016).
As a core component of children's early developmental environment, caregiver language input is typically characterized by two dimensions: structural and functional features. The structural aspects of caregiver input include complexity (e.g., mean length of utterance [MLU]), syntactic features such as wh-question constructions, and the frequency of different syntactic categories (Bottema-Beutel & Kim, 2021). The functional aspects, by contrast, encompass the intent behind caregiver language directed at their children.
Numerous studies have explored the structural features of caregiver language input for children with ASD in English-speaking contexts, yet inconsistent findings reflect the complexity of caregiver–child language dynamics, which are shaped by developmental stage, interaction context, and child characteristics. A longitudinal investigation revealed that caregivers of preschoolers with ASD produced shorter MLU overall during naturalistic free-play interactions, a difference linked to children's nonverbal cognition and diagnostic status, suggesting that caregivers adjust their syntactic complexity in response to children's developmental profiles (Fusaroli et al., 2019). Critically, this study highlighted reciprocal effects, with child speech and caregiver speech predicting one another. The MLU of children with ASD was also shorter than that of TD children. In contrast, Hutchins et al. (2017) found that during storytelling interactions with school-age children who already possessed language skills, caregivers of children with ASD produced longer utterances than caregivers of TD children when controlling for total amount of talk. These studies illustrate that caregiver language input to children with ASD is not static but dynamically adjusts to children's age, language ability, and interaction goals.
Caregiver language serves as an important and widely employed medium for conveying emotions and information to children. Accordingly, utterances can be mainly categorized into two types based on their functional features: affect-salient speech and information-salient speech (Bloom et al., 1996; Locke, 1996; Luigia et al., 1998). Subsequent studies have included attention getters, which typically involve children's names or nicknames, into the study of language functional features (Venuti et al., 2012). From a developmental perspective, the functional features of caregiver language are of vital importance (Herrera et al., 2004; Luigia et al., 1998). Affect-salient speech, which aims to stimulate children's enthusiasm for communication and interaction, consists of expressive, usually nonpropositional or seemingly meaningless utterances such as encouragement, singing, and greetings (Locke, 1996; Luigia et al., 1998). It may create a secure emotional context that enhances language learning, aligning with attachment theory's emphasis on emotional security as a foundation for exploration (Ainsworth et al., 2014). Information-salient speech, on the other hand, focuses on conveying propositional content about the self, the child, or the environment, including directives, questions, labels, and descriptions (Zampini et al., 2020). Its role in language acquisition is well-documented. For example, between 6 and 18 months, caregivers’ contingent labeling of objects during joint pointing facilitates infants’ word learning (Rowe & Zuckerman, 2016). Among children with ASD, questions and descriptions emerge as critical subcategories. Notably, caregivers of toddlers with ASD use fewer questions but comparable levels of descriptions relative to those of TD children during naturalistic play interactions, suggesting descriptions may serve as a compensatory verbal scaffold to support exploration of internal and external worlds (Venuti et al., 2012). 
Attention getters represent another functional category, particularly prominent in interactions involving children with ASD (Venuti et al., 2012). This adaptation reflects bidirectional dynamics: children's reduced responsiveness elicits more attention-getting attempts, which may inadvertently reinforce suboptimal interaction patterns. Despite these insights, critical gaps remain. Relatively few studies on functional language in ASD focus on how affect-salient speech, subcategories of information-salient speech, and attention getters interact to shape the language development environment.
When providing language input, caregivers often use gestures concurrently to emphasize, clarify, or supplement the content of their utterances. When interacting with toddlers of lower language proficiency during storytelling, caregivers increase their use of representational and supplementary gestures—those that add information not present in speech (Molnar et al., 2021). Effective use of gestures by caregivers, particularly when synchronized with speech, is associated with improved social and language outcomes in both neurotypical and neurodivergent children (Choi & Rowe, 2024; Lv et al., 2022). Overall, caregivers’ adaptive use of gestures provides a multimodal scaffold that is especially valuable for children with limited vocabulary or those in linguistically challenging environments. Understanding how caregivers integrate gestures with speech is essential to a comprehensive account of the linguistic environment in which toddlers develop during their early years.
In contrast to the extensive research on caregiver language input in English contexts, only a few studies have examined such input in Chinese contexts. One such study demonstrated that 3- to 4-year-old Chinese children with ASD receive less overall language input during semistructured play sessions than their TD peers, and that the language they are exposed to is less complex (Xu et al., 2021). Research on caregiver language input in non-English contexts is therefore of great significance. A study investigating caregiver input to children with ASD in Bulgarian and English contexts revealed that parents in these two contexts differ in their language input, for example in the percentage of questions they use (Barokova & Tager-Flusberg, 2024). Conducting research only on English-speaking children may overlook the nuances of caregiver input arising from cross-linguistic variations.
Caregiver Language Input and Engagement States of Children
Early language exposure benefits both TD children and those with ASD. Joint engagement—the dynamic state in which a child and caregiver actively share attention to an object or event—serves as a foundational context for language and social communication development (Adamson et al., 2009). For children with ASD, disruptions in joint engagement are well-documented (Adamson et al., 2009; Bottema-Beutel et al., 2014). Early developmental divergences in both social engagement and language among children are thought to influence caregiver input (Kushner et al., 2023). Caregivers adjust their use of labeling based on the engagement state of toddlers with ASD, highlighting that the context of caregiver input (i.e., the engagement state in which it occurs) modulates its effectiveness for toddlers with ASD during free play interactions. Collectively, these findings underscore the necessity of examining caregiver language input within specific engagement states to understand how it supports development in ASD. Furthermore, investigations within the Chinese context can complement existing research findings and enhance understanding of the relationship between caregiver language input and engagement states among children with ASD across diverse contexts.
Caregiver Language Input and Child/Caregiver Characteristics
The more fine-grained aspects of input that matter depend on the child's language ability or age, particularly for those with language delays or ASD (Choi et al., 2020; Fusaroli et al., 2023; Rowe & Snow, 2020). Previous findings revealed that during semistructured play interactions, caregivers of higher-functioning verbal 2- to 9-year-old children with ASD asked more questions, whereas those supporting lower-functioning nonverbal children relied more on directives and used shorter MLU (Konstantareas et al., 1988). This adjustment reflects both responsiveness to the child's needs and potential constraints imposed by limited verbal engagement. Children's age is also a potential influencing factor on caregiver language input. Eighteen months emerges as a critical point in toddlers’ language development, as the quantity and diversity of input they receive at this stage exert a significant impact on their subsequent language development (Gámez et al., 2023; Rowe, 2012).
Caregiver–child interaction is a dyadic process involving both caregivers and children. Children's characteristics may influence caregiver input during the interaction, while caregivers’ own traits and emotions may also shape their interactive behaviors. Studies have found that caregiver stress can reduce the quantity of language input provided, which in turn is associated with poorer language outcomes in both toddlers with ASD and their siblings (Markfeld et al., 2023). Caregivers’ depressive symptoms also affect their perceptions and reports of autistic traits, highlighting the complex interplay between caregiver characteristics and child outcomes (Goh et al., 2018). Beyond emotions, caregivers’ autistic traits may influence their interaction style; however, direct research linking caregiver autistic traits to their language input remains limited. When investigating caregiver language input, considering potential influencing factors from both caregiver and child perspectives facilitates a deeper and more accurate understanding of such input.
Extensive research has been conducted in English-speaking contexts on the early language development environment of toddlers with ASD and its potential influencing factors, laying a solid foundation for this field. However, research in Chinese contexts remains relatively scarce, particularly regarding caregiver–toddler interaction across diverse engagement states. Additionally, caregivers’ integration of gestures during language input warrants further exploration. A comprehensive and systematic analysis of caregiver language input to Chinese-speaking toddlers with ASD, along with the relationships between such input and child characteristics, not only deepens understanding of their early development environment but also provides a robust theoretical foundation for targeted interventions and educational practices.
Accordingly, this study addresses the following research questions:
Aim 1—In Chinese contexts, do caregivers of toddlers with ASD exhibit unique structural and functional characteristics in their language input during play interactions compared with those of TD toddlers? We hypothesized that caregivers of toddlers with ASD would exhibit shorter MLU, ask fewer questions, and use more attention getters (Fusaroli et al., 2019; Zanchi et al., 2024).
Aim 2—In Chinese contexts, do characteristics of caregiver language input vary with the engagement states of toddlers with ASD? We hypothesized that caregivers of toddlers with ASD would provide more labeling and descriptions during joint engagement (Kushner et al., 2023; Roemer et al., 2022).
Aim 3—In Chinese contexts, does caregiver language input (including gesture integration) vary with toddlers’ spoken language during interactions and their ages? We hypothesized that lack of spoken language during interactions and younger age would be associated with shorter MLU, fewer questions, and more attention getters (Zanchi et al., 2024).
Method
Participants
Participants were 70 caregivers and their toddlers with ASD (
In this study, inclusion criteria were as follows: (a) toddlers aged 12–24 months; (b) native Chinese speakers; (c) caregivers agreed to have videos taken and participate in the evaluation and follow-up. At a mean age of 18.7 months (standard deviation,
Participant Demographics and Descriptive Statistics.
Values are presented as mean (standard deviation) for
About 27% of interactive caregivers in this study had an education level below a bachelor's degree, 53% had a bachelor's degree, and 20% had an education level above a bachelor's degree. Additional demographic information is reported in Table 1. All families gave written informed consent. This study was approved by the Institutional Review Board at the Third Affiliated Hospital of Sun Yat-Sen University.
Measures
The Mullen Scales of Early Learning: AGS Edition
The MSEL is a standardized developmental assessment tool for children ranging from birth to 68 months old. It evaluates gross motor skills, visual reception, fine motor skills, receptive language, and expressive language. Upon completion of the assessment, the raw scores in each scale are converted into
The Autism Diagnostic Observation Schedule, Toddler Module
The ADOS-T is used for assessing toddlers aged 12–30 months and consists of two domains: social affect and restricted, repetitive behaviors. A total score is also calculated. According to the norms, the following ranges of concern are identified: little to no concern, mild to moderate concern, and moderate to severe concern. The calibrated severity score (CSS) of the ADOS-T is used to assess the severity of individual modules (Esler et al., 2016).
The Autism-Spectrum Quotient
The AQ is a self-report screening tool for the autistic traits of caregivers with normal intelligence. The AQ comprises 50 items, with each item answered on a scale from 1 to 4. Depending on the item, either responses 1 and 2 or responses 3 and 4 are scored as 1 point. The higher the score, the more obvious the autistic traits (Baron-Cohen et al., 2001).
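The binary scoring rule described above can be illustrated with a minimal sketch. The function name `score_aq` and the parameter `low_keyed_items` are hypothetical, and the item set used in the usage example is a placeholder rather than the published scoring key, which should be taken from Baron-Cohen et al. (2001).

```python
from typing import Iterable, Sequence


def score_aq(responses: Sequence[int], low_keyed_items: Iterable[int]) -> int:
    """Binary AQ-style scoring (illustrative sketch only).

    responses: 50 answers, each coded 1-4.
    low_keyed_items: 1-based item numbers for which responses 1 or 2 earn
        one point; for every other item, responses 3 or 4 earn one point.
        The real key comes from the published instrument; any values passed
        here in examples are hypothetical.
    """
    if len(responses) != 50:
        raise ValueError("expected 50 item responses")
    low_keyed = set(low_keyed_items)
    total = 0
    for item_number, answer in enumerate(responses, start=1):
        if answer not in (1, 2, 3, 4):
            raise ValueError(f"item {item_number}: response must be 1-4")
        if item_number in low_keyed:
            total += int(answer <= 2)  # responses 1 and 2 score a point
        else:
            total += int(answer >= 3)  # responses 3 and 4 score a point
    return total


# Usage with a placeholder key (items 1-25 treated as low-keyed).
placeholder_key = range(1, 26)
example_total = score_aq([1] * 25 + [4] * 25, placeholder_key)
```

Because each item contributes at most one point, totals range from 0 to 50 regardless of which key is supplied.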
The Patient Health Questionnaire-4
The PHQ-4 is an ultra-brief self-report screening tool for depression and anxiety in caregivers. A score of 3 or higher on the depression subscale is a reasonable cutoff for identifying potential major depressive disorder or other depressive conditions (Kroenke et al., 2003; Löwe et al., 2005). Similarly, a score of 3 or higher on the anxiety subscale serves as a reasonable cutoff value for detecting generalized anxiety, panic, social anxiety, and posttraumatic stress disorder (Kroenke et al., 2007). In this study, we calculated the positive cases of the depression and anxiety subscales respectively.
Coding Scheme of Caregiver–Toddler Play Interaction in a Natural Context
Caregivers were free to choose whether to conduct interactions in their actual home environment or a simulated naturalistic setting, based on their own convenience and preferences. For both contexts, we provided an identical set of age-appropriate toys for caregivers to use as reference, including a toy car, a music box, a bouncing toy, a set of boxes, eight textured blocks, eight building blocks, and eight plastic snowflakes. Videos needed to clearly capture the upper body movements, including facial expressions, of both caregivers and toddlers. All naturalistic caregiver–toddler interactions were carried out by the parent who was the primary caregiver in the toddler's daily life. Caregivers were instructed to play in a natural, everyday manner. Prior to formal recording, caregivers played with toddlers for 2 min as a warm-up. Caregivers could decide freely whether to use toys during play. Face-to-face dyadic interaction was permissible; that is, caregivers and toddlers could interact without toys, engaging instead in games such as peekaboo and lifting the child. Examples of caregiver–toddler play interactions are shown in Figure 1.
Figure 1. Examples of Caregiver–Toddler Play Interaction in a Natural Context: (a) a Simulated Naturalistic Setting and (b) an Actual Home Environment.
Video coding started immediately after the warm-up ended and continued for 10 min. If the camera did not clearly capture the toddler's activities or the toddler interacted with someone other than the caregiver, that segment was considered uncodable and removed when calculating the total effective duration of the video (in minutes). Two well-trained researchers identified and transcribed caregivers’ utterances and gestures, toddlers’ engagement states, and toddlers’ spoken language across all video recordings. ELAN 6.8 software was used for video coding.
Caregiver Language Input
All caregiver speech directed towards the toddler was segmented into single utterances, which served as the unit of analysis. The boundaries between utterances were defined by intonation changes, pauses longer than 1 s, and/or subject changes (Roemer et al., 2022), excluding semantically irrelevant parts or noises. Caregiver language input was analyzed in terms of its functional features, morphosyntactic features, and integrated gestures. Unintelligible or incomplete utterances were excluded from the analysis.
Regarding the functional features of the input, we adopted the same coding scheme used by Zampini et al. (2020). The functions are as follows:
The proportion of each category described above was calculated by dividing the number of utterances in each category by the total number of utterances by caregivers.
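The proportion calculation can be sketched as follows. This is an illustrative example, not the study's actual pipeline: the category labels are shorthand for the six functional categories discussed in this article, and returning `None` for a caregiver with no coded utterances mirrors the rule, stated under Data Analyses, that proportions with a denominator of 0 are treated as missing.

```python
from collections import Counter
from typing import Dict, Optional, Sequence

# Shorthand labels (assumed) for the six functional utterance categories.
CATEGORIES = (
    "affect_salient",
    "directive",
    "labeling",
    "question",
    "description",
    "attention_getter",
)


def category_proportions(codes: Sequence[str]) -> Dict[str, Optional[float]]:
    """Return each category's share of all coded caregiver utterances."""
    unknown = set(codes) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown category codes: {sorted(unknown)}")
    total = len(codes)
    if total == 0:
        # Zero denominator: treat every proportion as missing.
        return {category: None for category in CATEGORIES}
    counts = Counter(codes)
    return {category: counts[category] / total for category in CATEGORIES}


# Hypothetical coded utterances for one caregiver.
example = category_proportions(["question", "question", "directive", "description"])
```

Because every utterance receives exactly one code, the six proportions for a caregiver sum to 1 whenever at least one utterance was coded.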
Regarding the morphosyntactic features of the input, we considered the following measure:
Toddlers’ Spoken Language During the Interaction
The toddler's spoken language during the caregiver–toddler play interaction is defined as any spoken expression that uses a phonetic symbol system to convey clear information (O’Hare, 2005). In this study, the toddler's spoken language during the interaction was coded into two categories:
With spoken language during the interaction: the toddler produces at least one spoken expression, like “Mama” or “Want,” that can convey clear information during the entire effective coding period. Without spoken language during the interaction: the toddler produces no spoken expressions that can convey clear information throughout the entire effective coding period.
Toddlers’ Engagement States
We used the coding scheme for toddlers’ engagement states developed by Adamson et al. (2004, 2009). After excluding uncodable segments, the entire play video was coded into mutually exclusive engagement states, such that the end of one code marked the beginning of another. Engagement states included unengaged state, object engagement, joint engagement, and onlooking state. The specific definitions of each engagement state are shown in Table 2. To avoid micro-coding of brief fluctuations in attention, a 3-s principle was applied. That is, if the toddler’s gaze briefly shifted away from the interaction for less than 3 s to focus on another object or because of an unexpected noise, it was not coded as a change in the engagement state.
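One way to picture the 3-s principle is as a smoothing pass over a provisional sequence of coded segments. The sketch below is an assumption about how such a rule could be applied programmatically; in the study the rule was applied by human coders, and the function name, tuple layout, and example times are hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[str, float, float]  # (engagement state, start in s, end in s)


def apply_three_second_rule(segments: List[Segment], min_duration: float = 3.0) -> List[Segment]:
    """Absorb shifts shorter than min_duration into the ongoing state.

    Segments are assumed to be contiguous and in temporal order. A brief
    segment at the very start has no ongoing state to join, so it is kept.
    """
    smoothed: List[Segment] = []
    for state, start, end in segments:
        is_brief = (end - start) < min_duration
        if smoothed and (is_brief or smoothed[-1][0] == state):
            # Extend the ongoing state instead of opening a new code.
            previous_state, previous_start, _ = smoothed[-1]
            smoothed[-1] = (previous_state, previous_start, end)
        else:
            smoothed.append((state, start, end))
    return smoothed


# Hypothetical example: a 2-s glance away during joint engagement.
provisional = [
    ("joint", 0.0, 20.0),
    ("unengaged", 20.0, 22.0),
    ("joint", 22.0, 40.0),
    ("object", 40.0, 50.0),
]
smoothed_example = apply_three_second_rule(provisional)
```

In this example the 2-s glance is folded into the surrounding joint engagement, leaving one 40-s joint engagement segment followed by object engagement.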
Classification of Toddler Engagement States in Play Interactions.
Data Reduction
After coding, all the original data were processed. We calculated the total frequency of input, the proportion of utterances in each category, the proportion of utterances integrated with each gesture category, and the proportion of the engagement state in each category. Furthermore, within each of the toddlers’ engagement states, the proportion of each utterance category was calculated.
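The reduction steps above can be sketched under a few assumptions: each caregiver utterance is stored as a hypothetical `(category, engagement_state)` pair, frequency is expressed per minute of effective video duration, and a state containing no utterances yields `None`, in line with the rule under Data Analyses that zero-denominator proportions are treated as missing. The function names and labels are illustrative only.

```python
from collections import Counter
from typing import Dict, Optional, Sequence, Tuple

STATES = ("unengaged", "object", "joint", "onlooking")


def frequency_per_minute(n_utterances: int, effective_minutes: float) -> Optional[float]:
    """Utterances per minute of codable video; None if nothing was codable."""
    if effective_minutes <= 0:
        return None
    return n_utterances / effective_minutes


def proportions_within_states(
    utterances: Sequence[Tuple[str, str]],
    categories: Sequence[str],
    states: Sequence[str] = STATES,
) -> Dict[str, Dict[str, Optional[float]]]:
    """Share of each utterance category within each engagement state."""
    counts = {state: Counter() for state in states}
    for category, state in utterances:
        counts[state][category] += 1  # raises KeyError for an unlisted state
    result: Dict[str, Dict[str, Optional[float]]] = {}
    for state in states:
        total = sum(counts[state].values())
        result[state] = {
            category: (counts[state][category] / total if total else None)
            for category in categories
        }
    return result


# Hypothetical data for one dyad.
coded = [("question", "joint"), ("description", "joint"), ("directive", "unengaged")]
by_state = proportions_within_states(coded, ["question", "description", "directive"])
rate = frequency_per_minute(len(coded), 10.0)
```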
Training and Coding Reliability
Two doctoral students with professional backgrounds in child developmental behavior participated in the coding process of this study. To ensure coding consistency, the two coders conducted multiple rounds of pilot coding prior to the official coding phase. In each round of pilot coding, they independently completed full coding of a randomly selected interaction video. Official coding commenced once intercoder reliability exceeded .90. For any discrepancies, the coders reviewed the relevant segments repeatedly and only included them in the analysis after reaching a consensus. After ensuring coding reliability with training videos, the two coders independently coded the videos, and neither of them knew the diagnosis and ability information of the toddlers. Due to the involvement of variables at multiple levels, each video was processed 3 times. First, we coded the toddlers’ engagement states and excluded uncodable segments. Second, we coded all utterances of caregivers within the valid coding segments, as well as gestures integrated with these utterances. Finally, we determined whether the toddlers produced spoken language throughout the valid interaction segments. Twenty percent of the caregiver–toddler play interaction videos (
Data Analyses
We used IBM SPSS 26.0 for all statistical analyses and GraphPad Prism 10.4 to create most graphs. Alpha was set at .05 for statistical significance. Proportions with a denominator of 0 were treated as missing values in the analysis. Outliers exceeding three standard deviations were excluded from the analysis. The normality was assessed using the Shapiro–Wilk (S-W) test, and the corresponding inter-group comparison method was selected based on variable distribution. To test for differences in sample variables between the two groups, we used the
Comparison of Caregiver Language Input Characteristics Between ASD and TD Groups.
Frequency is calculated by dividing the number of utterances by the total effective duration of the video (in minutes).
Proportion of utterances in each category of functional speech (calculated by dividing the number of utterances in each category by the total number of utterances produced by the caregivers).
Caregiver Language Input to Toddlers With ASD: Comparison Between Those With and Without Spoken Language During Interactions.
Values are presented as mean (standard deviation) for
Proportion of utterances in each category of functional speech (calculated by dividing the number of utterances in each category by the total number of utterances produced by the caregivers).
Caregiver Language Input to Toddlers With ASD: Comparison Between 12–18-Month-Old and 19–24-Month-Old Toddlers.
Values are presented as mean (standard deviation) for
Proportion of utterances in each category of functional speech (calculated by dividing the number of utterances in each category by the total number of utterances produced by the caregivers).
Results
Sample Demographics and Descriptive Statistics
Table 1 presents the demographic and descriptive data of the ASD and TD groups. In total, 40 males and 30 females were enrolled in the study, with 30 males and nine females in the ASD group. According to Mullen's definition of developmental delay (
Aim 1—Caregiver Language Input Provided to Toddlers With ASD Versus TD Toddlers
Table 3 shows the characteristics of caregiver input in the ASD and TD groups. The proportion of questions and attention getters differed between the two groups. Specifically, the proportion of questions was significantly higher for TD toddlers than for toddlers with ASD (
Aim 2—Caregiver Language Input Within Different Engagement States of Toddlers With ASD
We compared the proportions of utterances across functional categories within each engagement state. Within the unengaged state, affect-salient speech accounted for the lowest proportion, with significant differences from directives (
Comparison of Proportions of Six Caregiver Functional Utterance Categories Within Four Engagement States of Toddlers With ASD.
Kendall's W, a statistic ranging from 0 to 1, was used to quantify the degree of intercaregiver agreement in these proportional distributions. Specifically, values < .3 indicate low consistency, values between .3 and .7 (inclusive) reflect moderate consistency, and values > .7 denote high consistency in the relative distribution patterns of the six utterance categories.
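For readers unfamiliar with the statistic, the following minimal sketch computes Kendall's W from a matrix in which each row holds one caregiver's proportions across the utterance categories. It uses the basic formula W = 12S / (m²(n³ − n)) with average ranks and no tie correction, which is a simplification; statistical packages may apply a tie correction and can therefore return slightly different values. Function names and example data are hypothetical.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 (smallest) upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kendalls_w(rows: Sequence[Sequence[float]]) -> float:
    """Kendall's W for m raters (rows) and n items (columns), no tie correction."""
    m, n = len(rows), len(rows[0])
    if m < 2 or n < 2 or any(len(row) != n for row in rows):
        raise ValueError("need at least 2 equal-length rows with at least 2 columns")
    ranked = [average_ranks(row) for row in rows]
    rank_sums = [sum(row[j] for row in ranked) for j in range(n)]
    mean_rank_sum = m * (n + 1) / 2
    s = sum((total - mean_rank_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def consistency_label(w: float) -> str:
    """Apply the thresholds given in the note above."""
    if w < 0.3:
        return "low"
    return "moderate" if w <= 0.7 else "high"


# Two hypothetical caregivers who order three categories identically.
w_example = kendalls_w([[0.1, 0.2, 0.7], [0.2, 0.3, 0.5]])
```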
We also compared the proportions of utterances across functional categories within each engagement state of TD toddlers (see Appendices S2 and S3).
Aim 3—Caregiver Language Input and Toddlers’ Individual Characteristics
In the present study, there were no intergroup differences in AQ scores or PHQ-4 scores between caregivers of toddlers with ASD and their TD counterparts. Therefore, we focused on the relationship between toddlers’ individual characteristics and caregiver language input.
Toddler Spoken Language During the Interaction
As reported in Table 4, no significant differences were observed in the characteristics of caregiver input between toddlers with ASD who had spoken language during the interaction and those who did not.
Within the TD group, only six toddlers did not have spoken language during the interaction. Given that the statistical power of such a small sample may be insufficient to yield reliable conclusions, we did not compare caregiver language input between TD toddlers who had spoken language during the interaction and those who did not.
Toddler Age
We examined differences in the characteristics of caregiver language input between 12–18-month-old and 19–24-month-old toddlers in the ASD and TD groups to determine whether toddlers’ age affected such input.
As shown in Table 5, in the ASD group, the proportions of disambiguating gestures differed between age groups. Disambiguating gestures were more common in input to 19–24-month-olds (
Discussion
This study explored the unique characteristics of caregiver language input during 10-min naturalistic play interactions with toddlers with ASD in Chinese contexts. It also analyzed the functional features of such input within different engagement states of toddlers with ASD. Moreover, this study explored the association between caregiver language input and toddlers’ individual characteristics. Key findings are as follows: (a) Chinese-speaking caregivers of toddlers with ASD used a lower proportion of questions and a higher proportion of attention getters during play interactions compared to those of TD toddlers; (b) caregivers adjust language input based on the engagement states of toddlers with ASD; (c) there are no significant differences in caregiver language input when interacting with toddlers with ASD who have or lack spoken language during the interaction; (d) when interacting with 12–18-month-old toddlers, whether with ASD or TD, caregivers use a lower proportion of disambiguating gestures compared to those interacting with 19–24-month-old toddlers.
Caregiver Language Input Provided to Toddlers With ASD Versus TD Toddlers
Previous research on language acquisition in atypical populations has found that mothers adjust their communication styles based on their children's language and cognitive capabilities (Boyce & Boyce, 2002; D’Odorico & Jacob, 2006; Venuti et al., 2012). This study expands the understanding of the characteristics of language input by caregivers of toddlers with ASD in Chinese contexts.
The primary aim of this study was to compare the language input of caregivers of toddlers with ASD and caregivers of TD toddlers. Consistent with our hypothesis, caregivers of toddlers with ASD posed fewer questions and used more attention getters. This finding aligns with the characteristics of language input by caregivers of children with ASD in English-speaking contexts (Venuti et al., 2012), suggesting that caregivers from different cultural backgrounds exhibit similar traits when interacting with children with ASD. Such adjustments in language input demonstrate cross-linguistic stability. Questions typically initiate conversations and encourage language output. Venuti et al. (2012) discovered that when interacting with toddlers with language development delays, caregivers tend to decrease open-ended questions and rely more on direct directives to sustain interactions. The low responsiveness of toddlers with ASD to social interactions may prompt caregivers to adopt more “controlling” strategies. Such caregivers may seek to regulate the situation by excessively structuring their toddlers’ behaviors and providing simplistic, repetitive cues. The high-proportion use of attention getters represents caregivers’ adaptive adjustment to the insufficient social interactions of toddlers with ASD. Contrary to our expectations, caregivers of toddlers with ASD and those of TD toddlers exhibited similar MLU in our study. Although caregivers of toddlers with ASD did have shorter MLU than their TD counterparts, this slight difference did not reach statistical significance. This result is inconsistent with previous findings (Britsch & Iverson, 2024). Given that toddlers with ASD often display delays in language comprehension, caregivers typically simplify syntactic structures to match these toddlers’ cognitive and language levels. One potential explanation for our results may lie in differences in MLU calculation methods between Chinese and English.
Additionally, future research should further explore MLU among Chinese caregivers using a larger sample size.
Despite significant differences, this study also uncovered several similarities in specific utterance characteristics between Chinese-speaking caregivers of the ASD and TD groups. This finding aligns with numerous previous studies in English-speaking contexts (Bottema-Beutel & Kim, 2021). The cross-linguistic consistency of such similarities suggests they are not arbitrary but reflect evolutionarily rooted or developmentally tailored strategies. Caregivers, regardless of a toddler's diagnostic status or cultural context, intuitively deploy utterances that align with foundational learning needs. Such similarities may serve as a scaffold, facilitating interactions and language development. In studies on TD toddlers, caregivers’ use of labeling enables toddlers to associate words with objects, thereby supporting language acquisition (Yu et al., 2019). When toddlers focus on objects, caregivers’ use of labeling can capture their attention and promote caregiver–toddler interactions (Bottema-Beutel et al., 2018). This facilitation is particularly beneficial for toddlers with ASD, who struggle with engaging in interaction. In both the ASD and TD groups, caregivers’ utterances during the interactions are mainly information-salient speech, which is consistent with the language patterns expected of mothers of 2-year-old toddlers (Bornstein et al., 1992; D’Odorico et al., 1999). At this stage, the language communication between mothers and toddlers focuses more on sharing internal and external world meanings rather than emotional expression (Venuti et al., 1997).
During language input, caregivers often integrate gestures to help toddlers understand their utterances. Among the three types of gestures, reinforcing gestures are most common in both groups, typically used to indicate the attention focus in utterances. Although toddlers with ASD exhibit fewer social responses than TD toddlers during caregiver–toddler interactions, their caregivers still use gestures effectively to enhance their social responses. A small-scale study even revealed that, compared to the TD group, caregivers of children with ASD produced more gestures and provided more scaffolding for their children's visual experiences (Yoshida et al., 2020). These findings collectively underscore caregivers’ ability to adapt their language to meet their toddlers’ unique needs (Emiddia Longobardi & Cristina Caselli, 2007).
Caregiver Language Input Within Different Engagement States of Toddlers With ASD
Our second aim was to determine whether caregiver language input varies according to the engagement states of toddlers with ASD in Chinese contexts. The proportions of the six utterance categories showed different trends across engagement states. Affect-salient speech had a consistently low proportion in all engagement states of toddlers with ASD except joint engagement. This indicates that when toddlers are inattentive to caregivers, emotional expression is not a key choice for interactions, and caregivers only increase its use in specific scenarios. Directives had a relatively high proportion in the unengaged, object engagement, and joint engagement states, reflecting the need for caregivers to guide toddlers’ behaviors in these three states. The directive style adopted by caregivers of toddlers with developmental disabilities has long been a debated topic. Most studies suggest that when caregivers exhibit a more directive style during interactions, toddlers show lower engagement, such as shorter conversation turns (Smith et al., 2018). Nevertheless, some studies contend that caregivers’ use of follow-in directives benefits toddlers’ language outcomes (Delehanty et al., 2023). In this study, caregivers’ directive utterances encompassed both appropriate structured guidance (e.g., “Throw the ball like this”) and restrictions (e.g., “Don’t move around”). However, this study did not distinguish whether these utterances followed toddlers’ attention. Future research could refine the classification of utterances to clarify the relationship between directives and toddlers’ attentional focus, which would more clearly define the role of directives in caregiver–toddler play interactions in Chinese contexts.
Labeling had a small proportion in all states. This was particularly obvious in the joint engagement state, indicating that caregivers use this simple form of utterance relatively less when toddlers are actively engaged in the interaction. The proportion of questions remained stable across states, but its ranking relative to other utterance categories varied, which suggests that caregivers adjust question-asking according to the interaction context. Descriptions had a relatively high proportion when toddlers were engaged in interaction, underscoring the importance of descriptive utterances in caregiver–toddler play interaction. When caregivers use rich descriptive expressions, toddlers are exposed to a broader range of vocabulary and language constructs. This exposure, in turn, enhances toddlers’ language comprehension and expression (Christakis et al., 2019). The proportion of attention getters changed irregularly. Caregivers mainly used them to attract toddlers’ attention when toddlers were unengaged. By comparison, caregivers of TD toddlers tended to use less affect-salient speech, labeling, and attention getters, while employing more questions and descriptive utterances. These differences in the characteristics of language input between caregivers of toddlers with ASD and those of TD toddlers indicate that caregivers adapt their language input strategies according to children's performance in interactions. Overall, these findings illustrate the dynamic and context-dependent nature of caregiver language input in Chinese-speaking contexts, highlighting how Chinese caregivers adapt their utterance strategies based on the engagement states of toddlers with ASD. When facing toddlers with ASD who have more complex and diverse engagement states, the language input of caregivers also shows more variation. Our results supplement the understanding of the impact of toddlers’ various engagement states on caregiver language input in the Chinese context.
Caregiver Language Input and Toddlers’ Individual Characteristics
Existing studies have demonstrated a significant reciprocal relationship between caregiver language input and toddlers’ language skills during the first 2 years of life (Choi et al., 2020). This bidirectional relationship is marked by dynamic adaptation: toddlers’ early MLU shapes caregivers’ future linguistic complexity (Smith et al., 2023), while caregivers’ vocabulary and syntax align with those of preschoolers across extended periods, maintaining stable congruence over six assessments spanning two years (Fusaroli et al., 2019). Such synchrony suggests that caregivers and toddlers continuously adjust to each other's linguistic cues to sustain communicative flow. Nevertheless, in this study, no significant differences in language input characteristics were observed between caregivers of toddlers with ASD who produced spoken language during the interaction and caregivers of those who did not. One possible explanation is that, during the second year of life, even toddlers with ASD who have spoken language typically reach only the single-word stage, with a limited vocabulary and low vocalization frequency. This subtle distinction may therefore be insufficient to exert a significant impact on caregiver language input. Another possibility relates to the fact that toddlers with ASD often display language delays of varying degrees and manifestations (Vogindroukas et al., 2022). Although caregivers generally prioritize toddlers’ language development, they may lack sensitivity to subtle changes in toddlers’ language abilities and thus fail to promptly adjust their language input strategies to align with toddlers’ evolving needs. This finding underscores the importance of supporting Chinese caregivers of toddlers with ASD in daily communication and across the toddlers’ long-term development. It is critical to guide caregivers to focus on the skills toddlers have already demonstrated and make targeted strategic adjustments, rather than solely addressing foreseeable difficulties.
This study reveals that, regardless of toddlers’ diagnoses, caregivers show common age-related patterns in the gestures they integrate with their utterances. When interacting with toddlers over 18 months of age, caregivers used more disambiguating gestures. This may reflect caregivers’ increased use of pronouns with older toddlers, which they accompany with disambiguating gestures to ensure semantic clarity. The similar trend in gesture changes among caregivers of the ASD and TD groups indicates that caregivers adjust their interaction strategies as toddlers age and that caregivers of both groups exhibit parallel adjustment patterns. This may reflect the shared expectations of Chinese caregivers regarding toddlers’ developmental abilities.
In summary, this study represents a pioneering effort to systematically examine the characteristics of caregiver language input during caregiver–toddler play interactions in natural Chinese-language settings. By concentrating on the functional dimension, it explores the association between toddlers’ diverse engagement states and the characteristics of caregiver language input. These results deepen the understanding of the early language development environment of Chinese toddlers with ASD and supplement previous studies in English-speaking contexts. Our findings indicate that the characteristics of caregiver language input are closely associated with toddlers’ performance during interactions, consistent with the bidirectional nature of caregiver–toddler interactions. When caregivers of toddlers with ASD adjust their language strategies during interactions, there are positive aspects (e.g., frequent use of descriptions and gestures) but also suboptimal adaptations (e.g., excessive reliance on attention-getting utterances to enhance toddlers’ engagement). Additionally, the potential implications of caregivers’ infrequent use of questions for toddlers with ASD warrant further investigation. These characteristics could serve as targets for early intervention. By focusing on naturalistic caregiver–toddler play interactions, we can guide Chinese caregivers to adopt more effective strategies, thereby facilitating toddlers’ long-term development. Specific interventions targeting caregiver–toddler dyadic interactions help caregivers identify which behaviors and language successfully regulate interactions with their toddlers, as well as those that yield little benefit. Such awareness may enable Chinese caregivers to persist during unsuccessful interactions and develop strategies better suited to their toddlers’ unique needs.
While this study yielded important findings on caregiver language input to toddlers in Chinese-speaking contexts, it had several limitations. First, a larger sample size would enhance the generalizability of the findings. Due to sample size constraints, the male-to-female ratios within the ASD and TD groups were not balanced. This imbalance somewhat undermined the comparability of the results, as well as the statistical power of the multiple group comparisons. Additionally, over half of the caregivers in this study held at least a bachelor's degree, potentially limiting the generalizability of the findings to caregivers with other educational backgrounds. Finally, the interaction context affects caregiver language input (Thompson et al., 2024), yet this study only focused on play contexts. To better understand caregiver language input characteristics, future research could explore language input patterns during other daily routines, such as dining, dressing, and bath time, in addition to play.
Conclusion
This study explored the unique characteristics of caregiver language input during 10-min naturalistic play interactions with toddlers with ASD in Chinese-speaking contexts. Key findings include: (a) Chinese caregivers of toddlers with ASD used a lower proportion of questions and a higher proportion of attention getters during play interactions than caregivers of TD toddlers; (b) caregivers adjusted their language input based on the engagement states of toddlers with ASD; (c) there were no significant differences in caregiver language input between toddlers with ASD who produced spoken language during the interaction and those who did not; and (d) regardless of diagnosis, caregivers used a lower proportion of disambiguating gestures with 12–18-month-old toddlers than with 19–24-month-old toddlers. These findings indicate that caregiver language input in Chinese contexts is dynamic and context sensitive. The behavioral characteristics of Chinese toddlers with ASD might shape the unique characteristics of caregiver language input.
Supplemental Material
sj-docx-1-dli-10.1177_23969415251389128 - Supplemental material for Caregiver Language Input in Different Engagement States During Play Interactions With Toddlers With Autism: An Observational Study
Supplemental material, sj-docx-1-dli-10.1177_23969415251389128 for Caregiver Language Input in Different Engagement States During Play Interactions With Toddlers With Autism: An Observational Study by Yijie Li, Shaoli Lv, Linru Liu, Leran Xue, Huishi Huang, Yu Xing, Qianying Ye, Feixia Zhang and Hongzhu Deng in Autism & Developmental Language Impairments
