Abstract
1 Introduction
Language comprehension involves the simultaneous processing of multiple strands of linguistic information, including syntax, lexical stress, and acoustic detail of the speech signal (Friederici, 2011; Hagoort, 2008; Martin, 2016). While these different strands of information can interact even at the earliest stage of processing, they are usually studied in isolation in electrophysiological studies of language processing. For example, Mismatch Negativity (MMN) studies of automatic processing of syntax compare syntactically different sentences that are acoustically and rhythmically similar. Similarly, in MMN studies of automatic processing of lexical stress, a comparison is made between isolated word pairs that have different stress patterns but are otherwise identical. As a result, relatively little is known about what happens when these linguistic factors covary, as occurs in natural language. It is, however, of great importance to examine how linguistic factors interact during language processing, even at the earliest stages of information extraction, as this brings us closer to understanding the natural course of language comprehension in everyday life. Against this background, the current MMN study examines the automatic processing of subject–verb agreement in Dutch when the sequences do not only differ in grammaticality but also in the lexical stress pattern of the verb.
The MMN is an attention-independent negative event-related potential (ERP) effect and is typically elicited by deviant (“oddball”) stimuli in a passive oddball paradigm, in which frequent auditory stimuli (“standards”) are occasionally alternated with infrequent auditory stimuli (“deviants”) (Näätänen et al., 2007). While the MMN was initially considered an index of automatic acoustic change detection (Näätänen & Winkler, 1999), later studies showed that it is also sensitive to higher-order linguistic factors such as syntax and semantics, thus reflecting experience-dependent long-term memory traces (Pulvermüller & Shtyrov, 2006; Shtyrov et al., 2003).
In an MMN study on automatic processing of grammatical agreement, Pulvermüller and Shtyrov (2003) examined the processing of grammatical and ungrammatical English subject–verb sequences (
Notably, these studies all manipulated syntactic correctness while keeping prosodic properties of the stimuli constant. One such prosodic property that can covary with syntax and is also subject to automatic processing is lexical stress. In passive oddball experiments, deviant stimuli whose stress pattern differs from that of the standards also elicit an MMN. For example, Weber et al. (2004) looked at the MMN responses to deviants with trochaic stress (strong-weak) presented among standards with iambic stress (weak-strong) and vice versa. The dominant stress pattern in German is trochaic, but iambically stressed words are nevertheless present in German. Weber and colleagues found that, in adults, both deviants elicited a qualitatively similar MMN, regardless of whether they had trochaic or iambic stress (for similar findings in Dutch, see Emmendorfer et al., 2020). In this study, however, it is unclear whether the MMN is triggered by a difference in stress pattern or a difference in the duration of the first syllable (i.e., stressed syllables have a longer duration). In a related study using disyllabic pseudowords in English, a free-stress language, Peter et al. (2012) presented standards which varied in absolute duration but not in the relative duration of the first and the second syllable (e.g., the pattern of the standards in a given block was always short-long). The deviants also varied in absolute duration, but their relative duration pattern was reversed (e.g., long-short). Interestingly, trochaic deviants elicited an MMN, but iambic deviants did not. Because of the variance in the absolute duration of the standards, this difference in MMN is more likely to be driven by the abstract stress pattern of the words than by their acoustic properties. More specifically, Peter et al. (2012) attributed this finding to the listeners’ familiarity with the trochaic stress pattern in English, arguing that the existing memory trace for iambic stress patterns in English listeners might be weaker than that for the trochaic stress pattern, leading to a reduced difference between the ERPs to trochaic standards and iambic deviants and hence a diminished MMN.
The above reviewed studies looked at free-stress languages, in which both trochaic and iambic stress are legal and thus present. This is not the case in languages with a fixed-stress pattern, in which stress always falls on the same syllable. MMN studies in these languages show, among other things, that the processing of lexical stress is influenced by the lexical status of the stress-bearing words. In a study in Finnish, a fixed-stress language with stress on the word-initial syllable, Ylinen et al. (2009) found that the MMN in response to real word deviants with an illegal stress pattern was delayed compared with the MMN in response to real word deviants with the legal stress pattern, suggesting that processing the illegal stress pattern is computationally demanding. However, as the standard stimuli in their experiment were always pseudowords with a legal stress pattern, the response to illegally stressed real word deviants might reflect both lexical and prosodic differences. That lexical familiarity indeed interacts with the processing of lexical stress is suggested by MMN studies in Hungarian, also a fixed-stress language with word-initial stress. When testing pseudowords only, Honbolygó and Csépe (2013) found that the deviant with an illegal stress pattern elicited an MMN, whereas the deviant with a legal stress pattern did not (see also Honbolygó et al., 2020). In contrast, in a follow-up study using real words, MMNs were elicited by both legally and illegally stressed words (Garami et al., 2017).
In sum, past MMN studies have shown that the brain automatically processes information about both syntax and lexical stress. Moreover, the neural response to variation in stress patterns is modulated both by the frequency of those stress patterns in the listener’s native language and by whether the stimuli are real words or pseudowords. It remains unclear, however, whether lexical stress also interacts with syntax at this very early stage of processing, since no study has manipulated both factors in a single experiment.
Such an interaction between prosodic cues and morphosyntactic processing is suggested by a recent series of ERP studies conducted in standard Swedish. In this variety of Swedish, each word has a bitonal lexical pitch accent, so word stem tones constitute strong predictive cues to the upcoming grammatical suffix (Roll et al., 2013; Söderström, Horne, & Roll, 2017a). These studies found that when the word stem tone incorrectly cued a suffix, the suffix elicited a P600 effect, sometimes preceded by a Left Anterior Negativity (Roll et al., 2010, 2013; Söderström et al., 2017a, 2017b). This Left Anterior Negativity was taken as an indicator of morphosyntactic processing, possibly reflecting the activation of an unprimed memory trace of a suffix (Söderström et al., 2017a). In light of these findings, the authors argued that listeners can use neural tone–suffix connections to rapidly pre-activate upcoming suffixes based on tonal information. It remains to be investigated whether automatic processing of lexical stress also influences early syntactic processing, in particular in a language in which these strands of information are not predictively cued by one another.
1.1 The present study
The foregoing literature review reveals two main findings. First, manipulations of both grammatical agreement and lexical stress independently modulate MMN amplitude, indicating that syntactic and prosodic information are both processed automatically at a very early stage in the processing stream. Second, prosodic information affects subsequent processing of morphosyntactic information, as evidenced by a modulation of the LAN and the P600. In the present study, we investigate whether and how variations of lexical stress modulate automatic processing of grammatical agreement as indexed by the MMN.
We adapted Pulvermüller and Shtyrov’s (2003) oddball paradigm by introducing an additional difference between the grammatical subject–verb sequences (
The findings reviewed above suggest two competing hypotheses. On one hand, in line with Pulvermüller and Shtyrov (2003) and similar studies, we hypothesize that automatic processing of subject–verb agreement proceeds independently of the processing of lexical stress. We predict that the MMN in response to the grammatical deviant (
2 Methods
2.1 Participants
Twenty-nine right-handed native speakers of Dutch participated in the experiment (18–29 years; mean age = 22 years; 21 females). Written consent was obtained from all participants after they had been informed about the study, along with confirmation that they had no hearing deficits and were not dyslexic. The participants received a fee for their participation.
2.2 Stimuli
Two Dutch subject–verb sequences were used as stimuli, consisting of a personal pronoun and a verb. The personal pronoun

Acoustic waveforms and corresponding spectrograms of the four stimuli used in the experiment. The verbs were identical up to the divergence point, which was placed between the verb stem (
A male native speaker of Dutch recorded the two pronoun–verb combinations at a sampling rate of 44.1 kHz and a resolution of 16 bits. The most clearly articulated phrases were selected and subjected to speech manipulation in Praat (Boersma & Weenink, 2014) to ensure that both types of pronoun–verb combinations were acoustically identical up to the crucial point of grammatical deviation (i.e., divergence point). Specifically, the verb stem of
The nonlinguistic sound, hereafter referred to as
2.3 Design
An auditory passive oddball paradigm was used, in which each of the four stimulus sequences appeared once as the deviant and once as the standard stimulus. Each standard–deviant combination was presented in a block of 250 stimuli (210 repetitions [84%] of standard and 40 repetitions [16%] of deviant). A total of 1,000 stimuli were presented to the participants. In block 1,
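As an illustration of the block structure described above, the following Python sketch builds one block of 250 trials containing 210 standards and 40 deviants. The function name, the random seed, and the constraint that every deviant is preceded by at least two standards are assumptions made for this example; the ordering constraints actually used in the experiment are not stated here.

```python
import random


def oddball_block(n_standard=210, n_deviant=40, min_gap=2, seed=0):
    """Return a list of 'standard'/'deviant' labels for one block.

    min_gap (hypothetical parameter): minimum number of standards
    that must precede each deviant.
    """
    rng = random.Random(seed)
    # One gap of standards before each deviant, plus one after the last.
    gaps = [min_gap] * n_deviant + [0]
    spare = n_standard - sum(gaps)
    if spare < 0:
        raise ValueError("not enough standards for the requested min_gap")
    # Distribute the remaining standards randomly over the gaps.
    for _ in range(spare):
        gaps[rng.randrange(len(gaps))] += 1
    sequence = []
    for gap in gaps[:-1]:
        sequence.extend(["standard"] * gap)
        sequence.append("deviant")
    sequence.extend(["standard"] * gaps[-1])
    return sequence


block = oddball_block()
```

With the default arguments the proportions match the design: 210 of 250 trials (84%) are standards and 40 of 250 (16%) are deviants.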
2.4 Procedure
This study was conducted following the ethical guidelines of the Utrecht University Institute of Linguistics. Each participant was tested individually. They were seated in front of a computer screen and watched a silent movie while the stimuli were auditorily presented over Tangent Evo E4 audio speakers, which were placed approximately 70 cm from the participant. They were instructed to watch the video and ignore the auditory signals to minimize the possibility that they would consciously and strategically approach the stimuli. The electroencephalogram (EEG) was recorded using 64 Ag/AgCl electrode BioSemi caps with standardized 10–20 configuration and a BioSemi ActiveTwo system at a sampling rate of 2048 Hz. We used two additional electrodes at the mastoids for referencing and six bipolar electro-oculogram electrodes to record horizontal and vertical eye movements. The recording session lasted 1 hour on average.
2.5 EEG processing
Preprocessing of the EEG data was performed in BrainVision Analyzer (version 2.1; Brain Products, Munich, Germany). The data were re-referenced offline to the average of the two mastoids, band-pass filtered at 0.1–35 Hz (24 dB/oct) and downsampled to 500 Hz. Epochs ranging from –200 to 700 ms were extracted relative to the divergence point, which was chosen as the zero point because the standard and deviant in each block were identical up to that point. ERPs were normalized to a 200 ms baseline window preceding the divergence point. We used an Ocular Correction transform based on Independent Component Analysis to filter artifacts resulting from eye blinks (vEOG) and eye movements (hEOG). Finally, we rejected individual channel-segment pairs which contained artifacts with an amplitude exceeding ± 75 μV, displaying a voltage step of 50 μV or more between two neighboring sampling points, or in which the difference in signal activity was lower than 0.5 μV in an interval of 100 ms (excluding <1% of the data).
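The baseline correction and the three rejection criteria can be made concrete with a small NumPy sketch. This is not the BrainVision Analyzer implementation: the function names and parameter names are hypothetical, the ocular correction step is omitted, and reading "difference in signal activity" as the max-minus-min range within a sliding 100 ms window is an assumption.

```python
import numpy as np


def baseline_correct(epoch, sfreq=500.0, baseline_s=0.2):
    """Subtract the mean of the pre-stimulus baseline from an epoch.

    The epoch is assumed to start baseline_s seconds before time zero,
    with time as the last axis.
    """
    n_base = int(round(baseline_s * sfreq))
    return epoch - epoch[..., :n_base].mean(axis=-1, keepdims=True)


def is_bad_segment(x, sfreq=500.0, max_abs=75.0, max_step=50.0,
                   min_activity=0.5, win_s=0.1):
    """Apply the three rejection criteria to one channel-segment (in µV)."""
    x = np.asarray(x, dtype=float)
    # Criterion 1: absolute amplitude exceeding +/- 75 µV.
    if np.max(np.abs(x)) > max_abs:
        return True
    # Criterion 2: step of 50 µV or more between neighboring samples.
    if np.max(np.abs(np.diff(x))) >= max_step:
        return True
    # Criterion 3: activity below 0.5 µV within any 100 ms window
    # (assumed here to mean the max-min range in that window).
    win = int(round(win_s * sfreq))
    windows = np.lib.stride_tricks.sliding_window_view(x, win)
    activity = windows.max(axis=1) - windows.min(axis=1)
    return bool(np.any(activity < min_activity))
```

Applied to every channel and segment, a function like `is_bad_segment` yields a boolean mask of channel-segment pairs to exclude before averaging.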
3 Results
For each channel, average MMNs were computed by subtracting, for each unique stimulus, the ERP response to the standard from the ERP response to the deviant.
For example, the MMN to
Given that the MMN peaks around 150 ms after deviance onset and has a frontal distribution, we performed statistical analysis on the average activity within a frontal region of interest (frontal electrodes FCz, FC1, FC2, Fz, F1, F2, AFz, AF3, and AF4) and within a time window ranging from 100 to 200 ms after deviance onset. A repeated measures ANOVA was conducted to assess the effects of Form (
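The dependent measure described in this section, the deviant-minus-standard difference averaged over the frontal region of interest and the 100 to 200 ms window, can be sketched as follows. The function name, the argument layout (channels by samples arrays, a list of channel names, and a time vector in seconds), and the inclusive window bounds are assumptions made for illustration; the sketch covers one participant and one stimulus and does not include the ANOVA.

```python
import numpy as np

# Frontal region of interest as listed in the text.
ROI = ["FCz", "FC1", "FC2", "Fz", "F1", "F2", "AFz", "AF3", "AF4"]


def mmn_roi_mean(erp_deviant, erp_standard, ch_names, times,
                 roi=ROI, window=(0.1, 0.2)):
    """Mean MMN amplitude in a region of interest and time window.

    erp_deviant, erp_standard: arrays of shape (n_channels, n_samples)
    holding the ERPs to the same stimulus presented as deviant and as
    standard. times: sample times in seconds relative to the
    divergence point.
    """
    # Difference wave: deviant minus standard for the same stimulus.
    difference = np.asarray(erp_deviant) - np.asarray(erp_standard)
    ch_idx = [ch_names.index(ch) for ch in roi]
    times = np.asarray(times)
    t_mask = (times >= window[0]) & (times <= window[1])
    return float(difference[ch_idx][:, t_mask].mean())
```

Computing this value per participant and condition gives the cell means that would enter a repeated measures analysis.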

Grand average ERPs in the frontal region of interest for the Mismatch Negativities (MMNs) to
Given the absence of an effect of linguistic context in the hypothesis-driven spatiotemporal region of interest (ROI), we further explored whether there were any effects beyond this specific ROI. As an exploratory analysis, we ran a cluster-based permutation test (Maris & Oostenveld, 2007) in the entire 0–700 ms time window and across all electrodes, testing the effects of Context, Form, as well as their interaction (i.e., difference between the MMNs to
4 Discussion
In this EEG study, we aimed to find out whether lexical stress affects automatic processing of grammatical agreement, as indicated by modulations of the Mismatch Negativity (MMN). We compared MMNs to deviants that were grammatical and had the typical stress pattern in Dutch for a sequence of plural pronoun and verb (
Previous ERP studies have shown that metrical variation affects syntactic processing at a later stage of processing, as evidenced by the fact that the P600 response to syntactically difficult structures is modulated by metrical violations (Schmidt-Kassow & Kotz, 2008) and by differences in rhythmic regularity (Roncaglia-Denissen et al., 2013). The P600 is often interpreted as reflecting a late stage in the processing stream where multiple sources of linguistic information are integrated (Kaan & Swaab, 2003) and is highly sensitive to task demands, such as the presence of a judgment task (Kolk et al., 2003; Schacht et al., 2014). Our findings suggest that lexical stress can affect syntactic processing even at a very early stage of information extraction when the participants do not pay attention to the stimuli.
It should be noted that our results do not indicate that lexical stress interacted with grammatical agreement, but rather that the latter might be insufficiently processed due to the variation in lexical stress. If the processing load induced by prosodic information is the same in the deviant as in the standard, as was the case in previous syntactic MMN studies (Brunellière et al., 2006; Menning et al., 2005; Pulvermüller & Shtyrov, 2003; Pulvermüller et al., 2008; Shtyrov et al., 2003), processing of grammatical agreement might be unaffected. In contrast, when the deviant has a stress pattern different from the standard, as in our study, this may dominate the processing, leaving no resources for automatic processing of grammatical agreement at the same time. This may especially be the case in a passive listening MMN paradigm, where attention is distracted from the input and processing resources are limited.
Although we interpret our findings as an effect of variation in lexical stress in the stimuli, there is an important caveat to this interpretation. The metrical difference between the verb forms is confounded by a phonological difference that we initially did not consider but that was brought to our attention during the review process. That is, the verb forms also differ in syllabic structure or number of syllables: the plural form
An alternative way to interpret our results is to look at processing at a level higher than the verb forms. The presence of an effect of Form indicates that, in both a linguistic and a nonlinguistic context, processing
5 Conclusions
While more conclusive evidence for the influence of the processing of lexical stress on early processing of grammatical agreement awaits further research, this study shows that early processing of grammatical agreement is affected by the processing of phonological properties of the speech signal. This casts doubt on the generalizability of previous MMN studies, which have studied the processing of different strands of information in isolation. Further research on how early processing of information affects processing at later stages is of vital importance to gaining better understanding of the natural course of language comprehension in everyday life.
Supplemental Material
Supplemental material, sj-docx-1-las-10.1177_00238309221098116 for Processing of Grammatical Agreement in the Face of Variation in Lexical Stress: A Mismatch Negativity Study by Cas W. Coopmans, Marijn E. Struiksma, Peter H. A. Coopmans and Aoju Chen in Language and Speech
