Abstract
Introduction
Understanding speech in noise is a critical and challenging task for hearing-aid users (Kochkin, 2002). A common way to evaluate this ability is to measure the speech reception threshold (SRT), a behavioral measure typically defined as the signal-to-noise ratio (SNR) at which a participant can accurately repeat 50% of the presented speech items (SRTbeh). This approach, however, relies on active participation of the listener, which might not always be feasible.
To address this limitation, recent research has explored the use of electroencephalography (EEG) measured while the subjects are listening to speech-in-noise stimuli to estimate the SRT (SRTneuro) (Borges et al., 2025; Lesenfants et al., 2019). This approach offers a promising alternative for assessing speech intelligibility without relying on active behavioral responses.
Researchers have estimated the SRTneuro with scalp EEG and linear models in younger normal-hearing (YNH) listeners. Both encoding (forward) and decoding (backward) models have been used to estimate SRTneuro (Borges et al., 2025; Lesenfants et al., 2019; Vanthornhout et al., 2018), with decoders offering robust speech envelope reconstruction and encoders providing spatial and temporal insights through the temporal response function (TRF) (Alickovic et al., 2019). The TRF is similar to the event-related potential (ERP). However, unlike ERPs, which average responses over multiple repetitions of discrete stimuli (Luck, 2014), the TRF can be calculated for continuous and non-repeated stimuli and reflects only the selected stimulus feature, not the entire stimulus. A decoder model that reconstructs the speech envelope from EEG was used to estimate the SRTneuro as the midpoint of a sigmoid function fitted to SNR-versus-reconstruction-accuracy datapoints (Borges et al., 2025; Vanthornhout et al., 2018). This approach is inspired by a standardized method to determine the SRTbeh. In a previous study using whole scalp EEG and this method, an SRTneuro within 3 dB of the SRTbeh was achieved for 100% of participants and within 2 dB for 75% (Borges et al., 2025).
Existing studies have focused on younger adults. Since age-related hearing loss is the most common type of hearing impairment (Gratton & Vázquez, 2003), it is relevant to investigate the SRTneuro estimation in older adults. Age-related changes include enhanced neural tracking of speech, as evidenced by increased envelope reconstruction accuracy from EEG (Karunathilake et al., 2023; Presacco et al., 2016). These age-related neural changes could potentially bias the SRTneuro estimation or challenge the method in other ways.
Importantly, SRTneuro offers advantages beyond traditional behavioral testing. It enables a more automated, passive estimation procedure, which can be useful for non-responsive individuals such as younger children or individuals with cognitive decline. Furthermore, continuous measurement of speech intelligibility could support adaptive hearing-aid technologies or enable real-time monitoring of listening conditions. However, the use of full-scalp electrodes is impractical for integration into hearing aids. To address this limitation, a recent study investigated the feasibility of measuring SRTneuro using a more discreet and unobtrusive approach: electrodes placed in and around the ear in YNH individuals (Borges et al., 2024). It was found that SRTneuro estimates derived from the electrode configuration of in-ear EEG and 4 electrodes around the ear closely matched the SRTbeh, with results similar to SRTneuro estimates obtained based on a full-scalp array with 66 electrodes. This finding is an important step toward potential integration of SRTneuro measurements into hearing aids and other ear-worn devices.
Despite these advances, the feasibility of using scalp and ear-EEG for SRTneuro estimation in older normal-hearing (ONH) individuals remains unexplored. The present study aims to fill this gap by evaluating SRTneuro estimation in an ONH group using both scalp and ear-EEG configurations, and by comparing the results with those obtained for a YNH group (Borges et al., 2025). A secondary objective was to investigate noise-induced changes in the TRF in ONH participants in order to better understand the neurological changes that could impact the SRTneuro estimation.
Materials and Methods
Participants
In this study, a new data set was collected from 22 ONH participants (16 females, 6 males), aged 57 to 76 years (mean age: 65 years) and combined with an existing data set from 20 YNH participants, originally collected for another study (Borges et al., 2025). The inclusion criteria for the ONH individuals were identical to those used for the YNH study: right-handed, no significant dyslexia affecting daily life, and no history of neurological disorders (Borges et al., 2025), with the exception of the age and hearing threshold requirements. The YNH population was required to be between 18 and 30 years old and have normal hearing defined as pure-tone thresholds of max 20 dB HL (dB hearing level) at 0.125, 0.25, 0.5, 0.75, 1, 1.5, 2, 3, 4, 6, and 8 kHz. The ONH participants were required to be above 50 years old and the ear with the lower threshold (i.e., the “better” ear) was required to meet the following criteria: a maximum threshold of 25 dB HL in the frequency range of 0.125 to 3 kHz, 30 dB HL at 4 kHz, 40 dB HL at 6 kHz, and 45 dB HL at 8 kHz. The other ear was allowed to have a 10-dB higher maximum threshold. The study was approved by Aarhus University's Institutional Review Board with the approval number 2023-014.
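The audiometric inclusion rule for the ONH group can be written out as a short check. The following is a minimal sketch, not the screening procedure actually used: the function names are hypothetical, audiograms are assumed to be given as dictionaries mapping frequency in kHz to threshold in dB HL, and a participant is assumed to qualify if either ear can serve as the "better" ear while the other ear stays within the 10-dB relaxed limits.

```python
def better_ear_limit(freq_khz):
    """Maximum allowed better-ear threshold (dB HL) at a given frequency (kHz)."""
    if freq_khz <= 3:
        return 25  # 0.125 to 3 kHz
    return {4: 30, 6: 40, 8: 45}[freq_khz]


def ear_within(thresholds, extra_db=0):
    """True if every threshold is at or below the limit plus an optional margin."""
    return all(level <= better_ear_limit(freq) + extra_db
               for freq, level in thresholds.items())


def meets_onh_criteria(left, right):
    """Assumption: either ear may act as the better ear; the other gets +10 dB."""
    return ((ear_within(left) and ear_within(right, 10))
            or (ear_within(right) and ear_within(left, 10)))
```

For example, a left ear at the limits and a right ear exactly 10 dB above them would pass, whereas a right ear 11 dB above the limit at any frequency would not.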
Experimental Setup
The experimental paradigm and setup followed the protocol outlined in the previous study (Borges et al., 2025). The study consisted of three visits. During the first visit, the SRTbeh was estimated, the Edinburgh Handedness Inventory test (Oldfield, 1971) was conducted, a reading span test (Daneman & Carpenter, 1980) was performed, and ear impressions were taken. The second visit involved data collection for a separate study and is therefore not further described here. The third visit involved EEG recordings for the current study.
Behaviorally Estimated Speech Reception Threshold—SRTbeh
The stimuli were presented by means of Etymotic ER-1 insert Earphones (Etymotic Research, Inc., IL, USA) via a soundcard (RME Hammerfall DSB Multiface II, Audio AG, Germany) using disposable foam tips.
The SRTbeh was estimated using the Danish Hearing In Noise Test (HINT) sentences (Nielsen & Dau, 2011) and an adaptive procedure. The 50% word reception score was found using steady-state speech-shaped noise, with the first sentence starting at an SNR of −10 dB. The speech level was set at 65 dB sound pressure level (SPL), while the level of the steady-state speech-shaped noise was varied to obtain the desired SNR. Initially, two concatenated HINT training sentence lists, comprising 40 sentences, were used for training. Subsequently, two concatenated HINT sentence lists were used to determine the SRTbeh (see details in Borges et al., 2025).
Experimental Paradigm and Setup for Speech Reception Threshold Estimated From EEG—SRTneuro
The stimuli were presented in the same setup as for the SRTbeh, with the exception that the sound tube from the earphones was connected to the sound bore in the ear-EEG earpieces for stimulus presentation during the EEG recordings. Scalp and in-ear EEG were recorded concurrently with two separate amplifiers at a sampling rate of 4096 Hz. Scalp EEG was recorded using the Biosemi Active EEG (Amsterdam, Netherlands) system with a 64-channel cap and 2 external mastoid electrodes and reference (CMS) between PO3 and POz. Ear-EEG was recorded using a SAGA32+/64+ system (TMSi, Oldenzaal, Netherlands) with 12 in-ear ear-EEG electrodes and an additional Fpz electrode using an average reference during acquisition. The SAGA amplifier features a hardware-implemented average reference, eliminating the need for a specific reference electrode during data acquisition. The Fpz electrode location was recorded by both amplifiers to allow for merging the data from the two amplifiers. The 12 silver/silver chloride (Ag/AgCl) ear-EEG electrodes had a diameter of 4 mm and were placed in the individually molded earpieces in positions ExA, ExB, ExC, ExT, ExI and ExK, where x is replaced with R if placed on the right-ear earpiece and with L if placed on the left-ear earpiece. The naming convention for the in-ear electrodes is outlined in Kidmose et al. (2013), the design of the soft earpieces is described in Kappel et al. (2019), and the electrodes are described and characterized in Kappel and Kidmose (2022). For an illustration of the electrodes and their placement see Borges et al. (2024). Before inserting the in-ear-EEG earpieces, the participants’ ears were cleaned with a wet cotton swab. The quality of the signals recorded from the in-ear electrodes was assessed through visual inspection in a live viewer, specifically by checking for artefacts induced by eye movements and jaw clenches.
The SRTneuro was estimated by presenting audiobook excerpts, each lasting approximately 60 seconds. This length was chosen to balance data quality and participant fatigue. The audiobook excerpts were played to the participants at five different SNRs relative to the behaviorally measured SRT (SRTbeh −4 dB, SRTbeh −2 dB, SRTbeh, SRTbeh +2 dB, SRTbeh +4 dB) and without noise (clean speech). The SNR range was chosen to obtain an informative variation in speech reception. If the range is too narrow, the variation in speech reception will be small and mainly driven by inter-trial variability. Conversely, if the range is too wide, the speech reception may become nearly binary, resulting in the subject either understanding almost nothing or nearly everything. The presentation level of the speech was fixed at 65 dB SPL while the steady-state speech-shaped noise was adjusted according to the desired SNR. A total of 16 excerpts were presented at each SNR in a randomized order. The audiobook material was filtered using a first-order lowpass filter with a cutoff frequency of 2 kHz to approximate the third-octave band power spectral density of the HINT material and thereby enhance comparability between the audiobook material and the HINT material. To keep the participants engaged in the listening task, a two-alternative forced-choice content-related question followed each excerpt. After every 16 trials, participants were given a recreational task, such as describing their morning routine in as much detail as possible. See further details on the SRTneuro paradigm and experimental setup in Borges et al. (2025) and Borges et al. (2024).
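The level scheme described above (fixed speech level, noise level adjusted to reach the desired SNR, SNRs defined relative to the individual SRTbeh) can be illustrated with a minimal sketch. This is not the stimulus-generation code used in the study: the function names are hypothetical, SNR is assumed to be defined on root-mean-square amplitudes, and calibration to dB SPL is left out.

```python
import numpy as np


def rms(x):
    """Root-mean-square amplitude of a signal."""
    return float(np.sqrt(np.mean(np.square(x))))


def mix_at_snr(speech, noise, snr_db):
    """Scale the noise (speech untouched) so that the speech-to-noise RMS ratio is snr_db."""
    gain = rms(speech) / (rms(noise) * 10 ** (snr_db / 20))
    return speech + gain * noise, gain


def snr_conditions(srt_beh_db, offsets_db=(-4, -2, 0, 2, 4)):
    """Absolute SNRs for the five noisy conditions, relative to the individual SRTbeh."""
    return [srt_beh_db + offset for offset in offsets_db]
```

With an SRTbeh of −5 dB, for instance, `snr_conditions(-5.0)` gives the SNRs −9, −7, −5, −3, and −1 dB; the clean-speech condition is simply the unmixed speech.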
Analysis
All analyses were conducted using MATLAB (The MathWorks Inc, Massachusetts, USA) and the mTRF toolbox (Crosse et al., 2016).
Stimulus Preprocessing
The envelopes of the audiobook excerpts were extracted as described in Borges et al. (2024). The broadband envelope of the analytical signal was obtained using the Hilbert transformation. Since the human auditory system has a compressive behavior, where louder sounds are not amplified as much as softer sounds in the presented stimuli, a power law of
EEG Preprocessing
The EEG data were processed as described in Borges et al. (2024). The in-ear EEG and scalp EEG were processed separately. Channels exhibiting a constant signal, such as those that were saturated, were first identified. The in-ear EEG electrodes were then referenced to the average of all in-ear electrodes, while the scalp electrodes were referenced to Fpz. Flat electrodes and electrodes with a standard deviation (SD) above three times the mean SD of the channels recorded with the same amplifier after highpass filtering with a cut-off frequency of 1 Hz were rejected. For the scalp EEG, the rejected channels were replaced using spherical interpolation, whereas for the in-ear EEG rejected electrodes were either omitted (for AEar referenced, see below) or replaced with the mean of the electrodes in the local area (Ear). Then all channels were referenced to Fpz, downsampled to 256 Hz, and bandpass filtered between 1 and 8 Hz using a zero-phase filter based on a sixth-order Butterworth polynomial. The filtered channels were further downsampled to 64 Hz and then epoched. To remove extreme artefacts all values greater than 100 μV and smaller than −100 μV were removed, along with 32 datapoints (0.5 s) before and after each artefact. To preserve the correlation structure of the signal, the removed points were reconstructed using autoregressive modeling, implemented with the “fillgaps” function in MATLAB. To evaluate the performance of the SRTneuro estimation across various electrode configurations, 13 configurations were selected, including in-ear electrodes and electrodes located around the ear (T7/8 and M1/2). In electrode configurations containing only two electrodes, one electrode was re-referenced to the other. In configurations with more than two electrodes, an average reference across all included electrodes was used. For reference, performance was also evaluated using a full-scalp electrode configuration, referenced to the mastoid average. 
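The extreme-artefact step described above can be sketched as follows for a single channel sampled at 64 Hz. This is a simplified illustration under stated assumptions: the function names are hypothetical, and the gap filling uses plain linear interpolation as a stand-in for the autoregressive reconstruction performed by MATLAB's "fillgaps" in the actual pipeline.

```python
import numpy as np


def artifact_mask(x, threshold_uv=100.0, pad=32):
    """Mark samples beyond +/-threshold_uv plus `pad` samples on either side."""
    bad = np.abs(x) > threshold_uv
    mask = bad.copy()
    for i in np.flatnonzero(bad):
        mask[max(0, i - pad): i + pad + 1] = True
    return mask


def fill_linear(x, mask):
    """Replace masked samples by linear interpolation between valid neighbours.

    Stand-in only: the study used autoregressive modeling instead.
    Requires at least one unmasked sample.
    """
    out = np.asarray(x, dtype=float).copy()
    good = ~mask
    out[mask] = np.interp(np.flatnonzero(mask), np.flatnonzero(good), out[good])
    return out
```

A single sample exceeding the threshold therefore removes 65 samples in total (the sample itself plus 32 on each side), unless the artefact lies close to the edge of the epoch.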
An overview of the in- and around-ear electrode configurations is provided in Table 1. For graphical illustration, see Supplementary Material S1.
An Overview of the Investigated Electrode Configurations With Electrodes In and Around the Ear.
The abbreviation for the configuration is shown along with the active electrode(s) and the reference electrode(s).
Temporal Response Function
An encoding model was trained to predict the EEG response from the envelope of the presented stimuli. Leave-one-out cross-validation was applied within each condition, resulting in 16 encoder models trained per condition and participant. TRF estimation involves inversion of the stimulus property covariance matrix. To prevent overfitting, ridge regularization with a regularization parameter (
A subset of 17 channels was selected for subsequent TRF analysis (FC5, FC3, FC1, FCz, FC2, FC4, FC6, C3, C1, Cz, C2, C4, F3, F1, Fz, F2, and F4), as these channels demonstrated high prediction accuracy in previous studies (Fuglsang et al., 2017). A permutation test with
Relationships between TRF amplitudes, latencies, and SNR were examined using linear mixed models (six models in total) in RStudio (R Core Team, Vienna, version 4.3.2) with the nlme package (Lindstrom & Bates, 1990; Pinheiro & Bates, 1996). A mixed model was chosen to account for inter-subject variability in repeated measures. The relationship was investigated using the model
To compare the mean prediction accuracies between the two age groups within each condition, a cluster permutation test was used to control for multiple comparisons across electrodes. This analysis was performed using the “ft_freqstatistics” function in FieldTrip (Oostenveld et al., 2011). The following settings were used: Monte-Carlo estimates of the significance probabilities, independent samples
Speech Reception Threshold Estimation From EEG
The SRTneuro was estimated using a linear decoder that was trained on the clean speech data to reconstruct the envelopes of the speech signals contained in the presented speech-and-noise stimuli (Borges et al., 2024). The reconstruction accuracy was calculated as the Pearson's correlation between the actual and the reconstructed envelopes. The noise floor was found using the same approach as used for the encoder. The reconstruction accuracy for the clean speech condition was evaluated using leave-one-out cross-validation, where one clean-speech excerpt was used for testing and 15 clean-speech excerpts were used for training. This procedure was repeated 16 times, with a different test trial in each iteration.
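The leave-one-out evaluation of the clean-speech decoder can be sketched in a few lines of NumPy. This is an illustrative ridge-regression backward model, not the mTRF-toolbox implementation used in the study; the function names, the number of lags (`n_lags`), and the regularization value (`ridge`) are hypothetical choices. Each trial is assumed to be an array of shape (samples, channels) for the EEG and (samples,) for the envelope.

```python
import numpy as np


def lag_matrix(eeg, n_lags):
    """Stack eeg[t], eeg[t+1], ..., eeg[t+n_lags-1] per channel (zero-padded at the end)."""
    n_samples, n_channels = eeg.shape
    X = np.zeros((n_samples, n_channels * n_lags))
    for k in range(n_lags):
        X[: n_samples - k, k * n_channels:(k + 1) * n_channels] = eeg[k:]
    return X


def fit_decoder(eeg_trials, env_trials, n_lags, ridge):
    """Ridge-regularized least-squares decoder mapping lagged EEG to the envelope."""
    X = np.vstack([lag_matrix(e, n_lags) for e in eeg_trials])
    y = np.concatenate(env_trials)
    XtX = X.T @ X
    return np.linalg.solve(XtX + ridge * np.eye(XtX.shape[0]), X.T @ y)


def loo_accuracy(eeg_trials, env_trials, n_lags=16, ridge=1.0):
    """Pearson correlation of each held-out trial, decoder trained on the rest."""
    scores = []
    for i in range(len(eeg_trials)):
        train = [j for j in range(len(eeg_trials)) if j != i]
        w = fit_decoder([eeg_trials[j] for j in train],
                        [env_trials[j] for j in train], n_lags, ridge)
        reconstruction = lag_matrix(eeg_trials[i], n_lags) @ w
        scores.append(np.corrcoef(reconstruction, env_trials[i])[0, 1])
    return np.array(scores)
```

With 16 clean-speech excerpts, `loo_accuracy` returns 16 correlations, one per held-out excerpt. For the noisy conditions, a decoder fitted on all clean-speech trials would instead be applied to each noisy trial in the same way.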
To investigate whether the mean reconstruction accuracy in the ONH individuals was significantly higher than that of the YNH individuals, as observed in Presacco et al. (2016), a permutation test was conducted. This test had a precision of 0.01 and a significance level of 5%. The null hypothesis was that the mean reconstruction accuracy of the ONH individuals was not higher than that of the YNH individuals. The alternative hypothesis was that the mean reconstruction accuracy in the ONH individuals was significantly higher than that of the YNH individuals.
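A one-sided permutation test of this kind can be sketched as below. The sketch assumes that the test statistic is the difference in group means and that group labels are shuffled across participants; the function name, the number of permutations, and the random seed are hypothetical, and the way the stated precision of 0.01 was enforced in the study is not reproduced here.

```python
import numpy as np


def permutation_test_greater(a, b, n_perm=10000, seed=0):
    """One-sided test of H1: mean(a) > mean(b), by shuffling group labels."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        if perm[: a.size].mean() - perm[a.size:].mean() >= observed:
            count += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (count + 1) / (n_perm + 1)
```

Here `a` would hold the per-participant mean reconstruction accuracies of the ONH group and `b` those of the YNH group; the null hypothesis is rejected at the 5% level if the returned p-value is below 0.05.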
A sigmoid function was then fitted to the reconstruction-accuracy-versus-SNR data points for each individual participant. The function used for the fit, as described by Farris-Trimble and McMurray (2013), is given by the following equation:
Here,
The SRTneuro estimation was evaluated using three performance measures: i) the percentage of valid SRTneuro estimations, where higher percentages indicate the method's applicability to more participants; ii) the number of participants with a difference between SRTbeh and SRTneuro within ±3 dB, where higher numbers indicate better precision; and iii) the SD of the difference between SRTbeh and SRTneuro, where lower values indicate better precision.
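The three performance measures can be computed as in the following minimal sketch. It assumes that an invalid SRTneuro estimate is encoded as NaN (a convention chosen here for illustration), that the SD is the sample SD over valid estimates only, and that the function and key names are hypothetical.

```python
import numpy as np


def srt_performance(srt_beh, srt_neuro, limit_db=3.0):
    """Percentage valid, count within +/-limit_db, and SD of SRTbeh - SRTneuro."""
    srt_beh = np.asarray(srt_beh, dtype=float)
    srt_neuro = np.asarray(srt_neuro, dtype=float)
    valid = ~np.isnan(srt_neuro)
    diff = srt_beh[valid] - srt_neuro[valid]
    return {
        "percent_valid": 100.0 * float(valid.mean()),
        "n_within_limit": int(np.sum(np.abs(diff) <= limit_db)),
        "sd_diff": float(np.std(diff, ddof=1)) if diff.size > 1 else float("nan"),
    }
```

The count in `n_within_limit` can be converted to a percentage of either all participants or only the valid estimates, so the denominator should be stated explicitly when reporting it.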
The performance of the SRTneuro estimation was evaluated by calculating the difference between SRTbeh and SRTneuro. To test whether the mean of the difference between SRTbeh and SRTneuro was significantly different between the age groups, a permutation test was conducted with a precision of 0.01 and a significance level of 5%. The null hypothesis was that there is no difference in the mean, and the alternative hypothesis was that there is a difference in the mean. To address whether there was a significant difference in the variance of these differences between the two age groups, a two-sample F-test for equal variances was performed. The null hypothesis for the F-test was that the differences in the two groups come from normal distributions with the same variance, and the alternative hypothesis was that the differences come from normal distributions with different variances.
Results
The measured pure-tone thresholds for the ONH participants are shown in Figure 1a. In the ONH group, the SRTbeh ranged from −5.4 dB to −4.0 dB, with a mean of −4.8 dB (see Figure 1b). To investigate age-related differences, this dataset was compared with data from 20 YNH participants collected in a previous study; in this group, the SRTbeh ranged from −6.0 dB to −3.3 dB with a mean of −5.4 dB (see Figure 1b; for details see Borges et al., 2025).

(A) Pure-Tone Thresholds Obtained in the Left and Right Ear for Each of the ONH Participants (Thin Lines) Along with the Mean and the Standard Deviation Across the Population (Bold Lines and Error Bars). The Inclusion Criteria for the “Better” Ear are Shown with a Red Dotted Line, and the Maximum Threshold Allowed for the Other Ear (10 dB Higher) is Shown as a Solid Red Line. (B) Boxplot of the Measured SRTbeh Values for Both ONH and YNH Participants. The Red Line Indicates the Median, the Edges of the Blue Box Represent the 25th and 75th Percentiles, the Black Whiskers Mark the Most Extreme Non-Outlier Values, and a Red Cross Marks Outliers Defined as Values More than 1.5 Times the Interquartile Range Away from the Bottom or Top of the Box. SRTbeh Values for Individual Participants are Shown as Black Circles (Horizontally Jittered for Improved Readability).
In Figure 2a, the resulting grand average TRFs for the two age groups are shown. Higher overall amplitudes are observed for the ONH individuals (solid lines) compared to the YNH individuals (dotted lines), while no clear differences in latency are evident between the two groups. TRFs for the individual participants can be found in Supplementary Material S2. The number of conditions for which a given component was not included on the participant level, due to an R2 for the Gaussian function being at or below 0.5 or the prediction accuracy not being significantly above the noise floor, is reported in Supplementary Material S3.

(A) The Grand Average TRF from the ONH Individuals Shown by Solid Lines and YNH Individuals Shown by Dotted Lines in the −100 to 400 ms Time Window for All Six SNRs. (B) T-Values from the Cluster Permutation Test, Comparing the Mean Prediction Accuracies Between the Two Age Groups (ONH - YNH) within Each Condition at the Participant Level. Electrodes Included in the Significant Cluster are Marked with a Star Symbol.
Figure 2b shows topography plots of
The Results From the Mixed Linear Model Analyzing TRF Amplitudes and Latencies Relative to SNR and Age Group, Along with the Corresponding Coefficients and Results From the Wald Test.
Statistically significant coefficients are shown in bold font.
An increased overall reconstruction accuracy for the ONH individuals can be observed in Figure 3. The permutation test, with the null hypothesis that the mean reconstruction accuracy of the ONH individuals is not larger than that of the YNH individuals, yielded a

Reconstruction Accuracy for the YNH and ONH Groups Along with the Noise Floor for Each Group. The Mean for Each Group is Shown in Bold, and the Shaded Areas Represent ±1 Standard Deviation Around the Mean of the Reconstruction Accuracy for the Group.
Table 3 and Figure 4 show that all ONH individuals obtained an SRTneuro within 3 dB of their SRTbeh when using the Scalp electrode configuration. The SD of the difference between the SRTbeh and the SRTneuro was 1.2 dB, with a median of 0.1 dB. A similar trend was observed for the YNH individuals. For the Scalp configuration, the permutation test with the null hypothesis of no between-age-group difference in the mean of the difference between SRTbeh and SRTneuro was non-significant (

The Upper Panel Shows a Bar Plot of the Percentage of Participants for Whom a Valid SRTneuro was Obtained, with Data from the YNH Individuals Shown in Blue and Data from the ONH Individuals Shown in Red. The Lower Panel Depicts a Boxplot of the Difference Between SRTbeh and SRTneuro; the Median is Shown as a Red Line, the 25th and 75th Percentiles are Indicated by the Boxes, and the Whiskers Indicate Extreme Values for Non-Outliers. The Participant-Specific Data Points are Depicted as Black Circles. Outliers are Defined as Datapoints more than 1.5 Times the Interquartile Range Away from the Top or Bottom of the Box and Marked with a Red Cross. 3-dB Limits are Shown as Green Dotted Lines.
Comparison of the Difference Between SRTbeh and SRTneuro for the ONH and YNH Individuals.
The first column shows the percentage of valid SRTneuro estimates out of all included participants in the age group. The following columns show the percentage of participants with a difference between SRTbeh and SRTneuro within 3 dB out of all valid SRTneuro estimates in the age group, and the SD of the difference between SRTbeh and SRTneuro for the valid SRTneuro estimates in the age group.
For the ONH individuals, SRTneuro estimates based on in-ear EEG electrodes (Ear and AEar) were obtained for 41% of participants, with SDs of 2.7 dB (Ear) and 1.9 dB (AEar). The proportion of participants with SRTneuro values within ±3 dB of SRTbeh was 23% for Ear and 32% for AEar. These results closely resembled those obtained for the YNH individuals. Estimations obtained with the mastoid electrodes only (M) were roughly similar. The percentage of SRTneuro estimates increased substantially for the ONH group when using the temporal electrodes (
When combining the electrodes from the in-ear EEG configurations (Ear and AEar) with mastoids (EarM and AEarM), the results were very similar to using the mastoid or in-ear electrodes separately for the ONH individuals. In contrast, combining the in-ear and mastoid electrodes (EarM) for the YNH individuals resulted in an increase of reliable SRTneuro estimations as compared to the M and Ear configurations. However, the percentage of participants within 3 dB for the EarM configuration was similar to that obtained for the Ear and M configurations, and the SD increased for the EarM configuration compared to the M and Ear configurations. When comparing the AEarM to the separate configurations (AEar and M) in the YNH individuals, a similar trend was observed, with an increase in reliable SRTneuro estimations, but a higher SD in AEarM compared to AEar and M.
When combining electrodes from the Ear configurations (AEar and Ear) with the temporal electrodes, the number of estimates roughly doubled, accompanied by a doubling also in the number of ONH participants with an SRTneuro within 3 dB difference of the SRTbeh. Furthermore, the SD decreased compared to the T configuration but increased compared to the Ear and AEar configurations. A similar trend was found in the YNH individuals. When combining the mastoid electrodes and the temporal electrodes (MT), very similar results for the ONH and YNH individuals were obtained, with a better SRTneuro estimation compared to configurations using the temporal and mastoid electrodes separately (M and T).
The electrode configurations combining the in-ear electrodes and the mastoid and temporal electrodes (EarMT and AEarMT) did not improve the SRTneuro estimation of ONH participants compared to using the MT configuration. For the YNH participants, on the other hand, the AEarMT configuration yielded more SRTneuro estimates within 3 dB of the SRTbeh and more reliable SRTneuro estimates than MT, as well as a minor increase in SD. For EarMT, an increase in the number of reliable SRTneuro estimates was found, but also an increase in SD and a decrease in the number of SRTneuro estimates within ±3 dB of SRTbeh.
Using only electrodes from one side of the head for the ONH individuals revealed similar results for the left side (LAEarMT) compared to both sides (AEarMT), while a slight decrease in SRTneuro estimation quality was observed for the right side (RAEarMT) compared to both sides (AEarMT).
For the YNH individuals, the right side (RAEarMT) showed only minor differences in estimation quality compared to both sides (AEarMT), whereas the left side (LAEarMT) yielded lower estimation quality than when using electrodes from both sides (AEarMT). In the YNH individuals, RAEarMT performed almost identically to the Scalp configuration. This was not the case for the ONH individuals, where the best-performing side (LAEarMT) yielded a lower overall number of SRTneuro estimates (86% vs. 100%), fewer SRTneuro estimates within 3 dB of the measured SRTbeh (73% vs. 100%), and a higher SD of the difference between SRTneuro and SRTbeh (2.1 dB vs. 1.2 dB) compared to the Scalp configuration.
Discussion
Summary of the Main Results
A statistically significant increase in stimulus reconstruction accuracy was observed for the ONH compared to the YNH individuals (see Figure 3), along with enhanced fronto-central EEG prediction accuracy for the ONH individuals compared to the YNH individuals (see Figure 2b). The ONH group had enhanced amplitudes of P1, N1, and P2 relative to the YNH group (i.e., a positive fixed effect for P1 and P2 and a negative effect for N1, Table 2). The SNR of the presented stimuli was a significant predictor of the latency of all components and of the amplitude of N1 and P2.
Regarding SRT estimation, no statistically significant difference between the two age groups was found for the mean and variance of the differences between SRTbeh and SRTneuro when using the Scalp configuration, indicating that there was no difference in SRT estimation quality across the two groups. In the in- and around-ear configurations, the difference between SRTbeh and SRTneuro was generally similar for the two age groups, with some exceptions: (i) replacing in-ear electrodes with temporal electrodes improved the SRT estimations more for the YNH individuals than for the ONH individuals, (ii) an increase in the number of SRTneuro estimates was observed when using the EarM configuration compared to M and Ear in the YNH individuals but not in the ONH individuals, (iii) when using electrodes from only one side of the head, the best SRTneuro estimate for the ONH individuals was obtained using the left side (LAEarMT configuration), whereas the best estimate for the YNH individuals was obtained using the right side (RAEarMT configuration), (iv) the estimation quality for the AEarMT/RAEarMT configuration was similar to the Scalp configuration (i.e., very high) for YNH individuals, whereas it was slightly reduced compared to the Scalp configuration for the ONH individuals.
Age-Related Differences in Reconstruction Accuracy and TRFs
The TRFs in Figure 2a show an increase in amplitude in the ONH individuals compared to the YNH individuals, which was confirmed by the statistical test showing that age group was a significant predictor of amplitude for all components (P1, N1, and P2). This increase in amplitude, and thereby in the SNR of the EEG signal, likely contributed to the enhanced reconstruction accuracy seen in the ONH individuals compared to the YNH individuals. The SNR of the stimuli was a significant predictor of the latency of all components and of the amplitude of N1 and P2. Caution is advised when interpreting changes in the latency and amplitude of peaks and troughs in the TRF waveform, as these features do not directly reflect the amplitude and latency of the underlying neural components. For instance, a change in the amplitude of a neural component can affect the latencies and amplitudes of multiple features in the TRF waveform (Luck & Kappenman, 2012). Furthermore, this study cannot specify the underlying causes of the observed differences, as the analysis conducted here only confirms their existence. Identifying the true causes of the changes would require additional experiments and is beyond the scope of the present study. However, an increasing latency with decreasing SNR may suggest that the additional neural processing required in low-SNR conditions delays the neural response. The decrease in N1 and P2 amplitude magnitude with lower SNR likely reflects the TRF encoder's challenge in tracking the speech envelope. As the speech envelope becomes masked by additive noise, the neural tracking of the speech envelope deteriorates, resulting in a lower TRF amplitude magnitude. In this study, no evidence was found to support the SNR as a reliable predictor of the P1 amplitude, suggesting that the P1 amplitude as a response to the speech stimulus envelope is less affected by SNR levels.
Enhanced envelope reconstruction accuracy in ONH individuals compared to YNH individuals has been observed in previous studies (Decruy et al., 2019; Karunathilake et al., 2023; McClaskey, 2024; Presacco et al., 2016) as well as increased TRF peak amplitudes (Karunathilake et al., 2023; Panela et al., 2024). However, in contrast to the present results, prediction accuracy has been found to decrease with age (Gillis et al., 2023), but this could be due to methodological differences, that is, using spectrogram and acoustic onset instead of envelope. This enhancement of the auditory response in ONH has been speculated to result from an excitatory/inhibitory imbalance (Alain et al., 2014), resulting in cortical hyperactivity in the auditory cortex including increased spontaneous neural firing, increased synchronization amongst neurons, and enhanced sound evoked responses (Herrmann & Butler, 2021).
Yet there is evidence that it could be a cortical effect of hearing loss rather than increasing age when it comes to neural fundamental-frequency tracking (Van Canneyt et al., 2021). Animal models have furthermore shown frequency-specific increases in spontaneous neuronal firing rate following noise exposure, linking the hyperactivity to hearing loss (Eggermont, 2015; Eggermont & Tass, 2015; Seki & Eggermont, 2003). Similar hyperactivity has also been observed in aging animal models in the absence of noise exposure. In these cases, age-related inner-ear dysfunctions such as degeneration of hair cells, the stria vascularis, and spiral ganglion cells are believed to underlie the observed hyperactivity (Bao & Ohlemiller, 2010; Dubno et al., 2013; Gratton & Vázquez, 2003; Keithley, 2020; Moore, 1987; Plack, 2014; Schmiedt, 2010).
Another potential explanation is recruitment of additional cortical regions to process the same stimuli in ONH, compensating for a reduction in the specialized processing regions (Brodbeck et al., 2018; Peelle et al., 2010). Karunathilake et al. (2023) also investigated the latency changes of M50trf, M100trf, and M200trf, that is, the MEG counterparts of the P1, N1, and P2 deflections discussed in the current study. They found significant noise-related delays in latencies of these components, which support the findings of the current study. It is important to note that the study by Karunathilake et al. (2023) utilized babble noise, whereas the present study employed steady-state speech-shaped noise. It is promising for the application of the SRTneuro method that some of the same neurological changes were also found when using more naturalistic noise such as babble noise in the stimuli. Other studies have also reported that the latencies of TRF deflections decrease and their amplitudes increase in magnitude with higher SNR levels in YNH individuals using MEG (Ding & Simon, 2013) and in preschool children using EEG (Van Hirtum et al., 2023).
Age-Related Differences in SRTneuro Estimation
There was no statistical evidence for an age-group difference in mean or variance of the differences between SRTbeh and SRTneuro when using the Scalp configuration, despite the overall increase in reconstruction accuracy in the ONH individuals compared to the YNH individuals, as shown in Figure 3. This suggests that the midpoint of the fitted sigmoid function (the SRTneuro) remains unaffected by changes in overall reconstruction accuracy, which is a highly desirable characteristic of the SRTneuro estimation method. Furthermore, the observation that the variance of the estimate is also independent of reconstruction accuracy level demonstrates the robustness of the estimation method to variability in reconstruction accuracies across individuals. A possible explanation for this robustness lies in the nature of the sigmoid fitting procedure, which takes the individual reconstruction accuracy level into account and can thus be considered a form of normalization of the neural tracking strength of the individual participant. However, it is somewhat unclear why an improvement in reconstruction accuracy does not translate into a higher percentage of reliably estimated SRT values.
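The normalization argument can be made explicit with a short derivation. This is a sketch under an assumed four-parameter sigmoid; the exact parameterization used for the fit is not restated in this section, and the symbols $r_0$ (accuracy floor), $A$ (amplitude), $k$ (slope), $m$ (midpoint), $\alpha$, and $\beta$ are introduced here for illustration only.

```latex
\begin{align}
r(\mathrm{SNR}) &= r_0 + \frac{A}{1 + e^{-k(\mathrm{SNR} - m)}},
  \qquad \mathrm{SRT_{neuro}} = m \\
r'(\mathrm{SNR}) &= \alpha\, r(\mathrm{SNR}) + \beta
  = (\alpha r_0 + \beta) + \frac{\alpha A}{1 + e^{-k(\mathrm{SNR} - m)}},
  \qquad \alpha > 0
\end{align}
```

Under this assumption, a uniform gain $\alpha$ or offset $\beta$ in reconstruction accuracy is absorbed entirely by the floor and amplitude parameters, while the midpoint $m$ and slope $k$ are unchanged. Age-related changes in reconstruction accuracy need not be exactly affine, so this illustrates the mechanism rather than proving the observed robustness.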
In the current study, the SRTneuro obtained from electrodes in and around the ears showed comparable results for the two age groups, with a few exceptions (see Figure 4 and Table 3). For the YNH individuals, the number of reliable SRTneuro estimates increased when using the EarM configuration compared to using M or Ear. There was no increase in the number of YNH participants with an SRTneuro within ±3 dB of the SRTbeh, and the SD of the difference between SRTneuro and SRTbeh increased. This suggests that the increase in reliable SRTneuro estimates came at the expense of the overall precision of the estimates. This trend was not observed in the ONH individuals, indicating that the difference observed between YNH and ONH may be due to random fluctuations. Compared to the ONH individuals, the YNH individuals benefited more from using only temporal electrodes rather than in-ear electrodes. This could be due to an enhanced response in the ONH individuals in the areas outside the core auditory cortex (Brodbeck et al., 2018). When the response area is broader, neighboring channels may capture more synchronized activity, so that the T electrodes add less information because they record activity similar to that of the in-ear electrodes.
When using electrodes from only one side of the head, the best SRTneuro estimates in the ONH individuals were found on the left side (LAEarMT configuration), whereas for the YNH individuals they were found for the right side (RAEarMT). A study by Brodbeck et al. (2018) compared the prediction accuracy of the envelope for ONH and YNH and found that there was an increased prediction accuracy in ONH and that it was particularly pronounced in the left temporal lobe. This increased activity on the left side of the head with age could explain the benefit of estimating SRTneuro based on left-side electrodes for the ONH individuals but not for the YNH, as observed in the current study. This is further supported by the fact that the cluster permutation test in the current study revealed enhanced prediction accuracy for the ONH in a fronto-central area across all SNR conditions, with more electrodes in the cluster around the left ear compared to the right (see Figure 2b). In the current study, using only electrodes from the “better” side of the head for the ONH individuals did not yield SRTneuro estimation performance comparable to that obtained when using all scalp electrodes, whereas for the YNH individuals this was the case. This could be due to enhanced recruitment of neurons in the areas close to the temporal lobe in ONH resulting in a smaller difference in the potential measured by the neighboring channel for the ONH compared to the YNH individuals (Brodbeck et al., 2018).
In the current study, the SNRs were chosen in 2 dB steps around the SRTbeh. Previous work (Borges et al., 2025) investigated whether this SNR selection strategy biased the SRTneuro estimation. To assess this, the same SRTneuro estimation methods were applied to data sets simulated based on the same underlying function but sampled for different SNR ranges. In particular, the SNR range was moved by −4 dB to 4 dB in 1 dB intervals (9 distinct SNR sets). This analysis showed no evidence of the SNR selection biasing the SRTneuro estimation.
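The shifted-range check described above can be sketched as follows. This is a minimal, hypothetical illustration and not the procedure of Borges et al. (2025): the sigmoid parameterization, the parameter values, the number of SNRs per set, the noise level, and the grid-search fit are all assumptions chosen to keep the example self-contained.

```python
import math
import random

# All values below are hypothetical and chosen for illustration only.
TRUE_SRT = -7.0            # midpoint of the underlying sigmoid (dB SNR)
TRUE_SLOPE = 0.8           # slope of the underlying sigmoid (1/dB)
FLOOR, AMP = 0.02, 0.10    # reconstruction-accuracy floor and amplitude
BASE_OFFSETS = [-6, -4, -2, 0, 2, 4, 6]   # 2 dB steps around the SRT (count assumed)
MIDPOINT_GRID = [-17.0 + 0.5 * i for i in range(41)]   # -17 to 3 dB
SLOPE_GRID = [j / 5 for j in range(1, 11)]             # 0.2 to 2.0 per dB


def sigmoid(snr, midpoint, slope):
    return 1.0 / (1.0 + math.exp(-slope * (snr - midpoint)))


def fit_midpoint(snrs, accs):
    """Grid search over midpoint and slope; for each candidate, floor and
    amplitude are solved in closed form by linear least squares."""
    n = len(snrs)
    y_mean = sum(accs) / n
    best_sse, best_mid = math.inf, None
    for mid in MIDPOINT_GRID:
        for slope in SLOPE_GRID:
            s = [sigmoid(x, mid, slope) for x in snrs]
            s_mean = sum(s) / n
            sxx = sum((si - s_mean) ** 2 for si in s)
            if sxx < 1e-12:      # sigmoid is flat over this SNR set
                continue
            amp = sum((si - s_mean) * (yi - y_mean)
                      for si, yi in zip(s, accs)) / sxx
            if amp <= 0:         # only rising sigmoids are accepted
                continue
            floor = y_mean - amp * s_mean
            sse = sum((floor + amp * si - yi) ** 2 for si, yi in zip(s, accs))
            if sse < best_sse:
                best_sse, best_mid = sse, mid
    return best_mid


def mean_bias(shift_db, noise_sd, n_rep, rng):
    """Mean of (estimated - true) midpoint when the SNR set is shifted."""
    snrs = [TRUE_SRT + off + shift_db for off in BASE_OFFSETS]
    errors = []
    for _ in range(n_rep):
        accs = [FLOOR + AMP * sigmoid(x, TRUE_SRT, TRUE_SLOPE)
                + rng.gauss(0.0, noise_sd) for x in snrs]
        errors.append(fit_midpoint(snrs, accs) - TRUE_SRT)
    return sum(errors) / n_rep


if __name__ == "__main__":
    rng = random.Random(0)
    for shift in range(-4, 5):   # -4 dB to 4 dB in 1 dB steps: 9 SNR sets
        bias = mean_bias(shift, noise_sd=0.005, n_rep=10, rng=rng)
        print(f"shift {shift:+d} dB: mean bias {bias:+.2f} dB")
```

A mean bias that stays close to zero across all nine shifts would indicate that the placement of the SNR set relative to the true midpoint does not systematically pull the estimate; in the noise-free case the grid search recovers the true midpoint for every shift.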
Application
SRTneuro provides a continuous measure of the SRT, offering new opportunities for real-world assessment and adaptation in hearing care. Logging of SRTneuro during daily-life situations could inform optimization of hearing-aid performance and support personalized rehabilitation strategies. Furthermore, continuous assessment of SRTneuro based on uncontrolled natural speech could enable hearing aids to dynamically adjust their performance in real time to optimize the user's speech intelligibility.
The present study shows that the SRTneuro can be estimated based on in-ear-EEG alone in the ONH individuals with similar precision as in YNH individuals, especially when also including electrodes around the ear. The SRTneuro estimation was independent of age group even though higher reconstruction accuracies were observed in the ONH individuals. This is an advantage for an automatic SRTneuro estimation, as it suggests that the estimation method is robust with respect to effects of age on reconstruction accuracy and therefore does not need to be specifically tailored to different age groups. The SRTneuro can be estimated using electrodes in and around the ear from one side of the head, with slightly lower precision in the ONH individuals compared to using all scalp electrodes, but with the same precision as obtained for the full scalp configuration in the YNH individuals. However, the better side for precise SRTneuro estimation when only using electrodes in and around the ear from one side differed between ONH and YNH groups. If an SRTneuro measurement platform were to be used across age groups, there are thus three options: (i) electrodes from both ears could be used for the SRTneuro estimation, (ii) the better ear for the participant could be identified and used for the SRTneuro estimation, or (iii) the ear used for the SRTneuro estimation could be determined by age. Since hearing impairment is associated with a higher age (Gratton & Vázquez, 2003), and in the current study the ONH group obtained more precise SRTneuro estimation when using electrodes from the left side, it is likely that a left-sided electrode placement would yield good results for most hearing-aid users. Using electrodes from only one side of the head allows the SRTneuro to be obtained without connecting the two hearing aids and would therefore be more practical.
Another solution that does not require a connection between the two hearing aids could be the use of two different reference systems (one for each side) as input for the SRTneuro estimation; this method has not been explored in the current study.
When applying the proposed method in an actual clinical context, where the behavioral SRT (SRTbeh) is unknown, sampling across a broader range of SNRs would likely be required to adequately capture the informative portion of the underlying sigmoid function. This would come at the cost of increased measurement duration.
In the ONH population, only the Scalp electrode configuration yielded an SRTneuro estimate for all participants. This is due to the quality requirements implemented for the sigmoid fit and reconstruction accuracy datapoints. If ear-EEG electrode configurations were to be implemented in practice, these quality requirements should be revisited to ensure that SRTneuro fits are only conducted when the reconstruction accuracy is reliably tracked. Furthermore, if the increase in reconstruction accuracy is not reliably tracked, additional data could be recorded and included until this is the case.
The method would likely be improved by training the model on more data, since the standard error of the mean generally decreases with
Limitations
The SRTneuro estimation relies on the envelope-following response in the EEG rather than on speech-intelligibility scores. Here it is important to note that the envelope-following response reflects an encoding of acoustic features, not necessarily comprehension. While the two measures are highly correlated (Ding & Simon, 2013; Iotzov & Parra, 2019; Shannon et al., 1995; Vanthornhout et al., 2018), an envelope-following response is likely a prerequisite for comprehension rather than sufficient for it.
The study is limited by the amount of data collected. Given that in-ear EEG is well-suited for long-term monitoring, the collection of additional data should enhance the decoding model and improve the precision of the SRTneuro estimation.
If the SRTneuro were implemented in a hearing aid, many unknown factors could influence the estimation, such as different types of noise, variations in room acoustics, and acoustic features of the target speech. Furthermore, the method has not been explored in hearing-impaired individuals, for whom a larger variation in the SRTs is expected. To strengthen the generalizability of the study, a logical next step would therefore be to conduct a study with hearing-impaired individuals. This would allow for validation across a wider SRT range and provide insights into individual differences. In the current study, the YNH and ONH groups had relatively balanced hearing threshold levels, and the behavioral SRTs measured in the two groups were very similar with little interindividual variation. However, it should be noted that differences in speech intelligibility and other psychoacoustic measures between younger and older listeners with normal hearing are expected (see Goossens et al., 2017; 53-54; Regev, Oxenham, et al., 2025; Regev, Zaar, et al., 2025; Working Group on Speech Understanding & Aging, 1988), although these differences may not necessarily emerge in the conditions used in the current study (speech mixed with speech-shaped noise).
It is unclear how left/right imbalances in hearing loss may affect the quality of the SRTneuro estimation from each side. A study by Presacco et al. (2019) compared the reconstruction accuracy of the envelope between ONH individuals and older hearing-impaired individuals, finding no significant differences between the two populations. However, one study found that elevated hearing thresholds and impaired speech intelligibility were associated with an increased correlation between the EEG and the amplitude envelope of the presented stimuli (Schmitt et al., 2022), and an increase in cortical responses to sound in hearing-impaired compared to older adults has also been observed (Alain et al., 2014; Millman et al., 2017; Tremblay et al., 2003). In the current study, enhanced reconstruction accuracy did not have an effect on the SRTneuro estimation quality. However, changes in neural activation due to hearing loss may interact differently with the SRT estimation method used in the current study than changes in neural activation due to age.
The current study demonstrates the feasibility of estimating SRTneuro using electrodes in and around the ear. While integrating advanced technology such as ear-EEG into hearing aids could pave the way for neuro-steered hearing aids, with SRTneuro estimation serving as a concrete example of the potential opportunities, incorporating such technologies also introduces challenges that must be carefully balanced against user benefits and economic costs. Several of these challenges, though beyond the scope of the current study, are worth highlighting: (i) increased power consumption in an already power-constrained device; (ii) limited physical space for integrating electrodes and supporting electronics; (iii) vulnerability to electromagnetic interference from both internal and external sources; (iv) reduced control over recording conditions in real-world settings, where environmental noise may affect the SNR; (v) privacy and ethical concerns related to continuous EEG recording. Such considerations are crucial for the successful implementation and acceptance of neuro-steered hearing aids.
Conclusion
The ONH individuals showed similar SRTs to their YNH peers, while also exhibiting an overall increase in envelope reconstruction accuracy. However, the precision of the SRTneuro estimate did not significantly differ across the two age groups. For scalp EEG, the SRTneuro was estimated with good precision in all participants in both the YNH and ONH groups. When restricting the estimation to in-the-ear electrodes, the number of individuals with an estimated SRTneuro decreased to 45% and 41% for the YNH and ONH groups, respectively. When combining in-the-ear and around-the-ear electrodes, the maximum percentage of individuals with an estimated SRTneuro was 100% for the YNH and 86% for the ONH. An analysis of spatiotemporal responses through the TRF revealed that the ONH group exhibited increased amplitudes of the P1 (∼50 ms), N1 (∼120 ms) and P2 (∼200 ms) deflections compared to the YNH group. TRF latencies decreased with increasing SNR, while the amplitudes of the N1 and P2 deflections increased as the SNR increased. Overall, these findings demonstrate the robustness of the SRTneuro estimation method with regard to age-related changes in neural speech-envelope tracking.
Supplemental Material
sj-pdf-1-tia-10.1177_23312165251372462 - Supplemental material for Age-Related Differences in EEG-Based Speech Reception Threshold Estimation Using Scalp and Ear-EEG
Supplemental material, sj-pdf-1-tia-10.1177_23312165251372462 for Age-Related Differences in EEG-Based Speech Reception Threshold Estimation Using Scalp and Ear-EEG by Heidi B Borges, Emina Alickovic, Christian B Christensen, Preben Kidmose and Johannes Zaar in Hearing