Abstract

Previous studies have demonstrated the feasibility of estimating the speech reception threshold (SRT) based on electroencephalography (EEG), termed SRT_neuro, in younger normal-hearing (YNH) participants. This method may support speech perception in hearing-aid users through continuous adaptation of noise-reduction algorithms. The prevalence of hearing impairment and thereby hearing-aid use increases with age. The SRT_neuro estimation is based on envelope reconstruction accuracy, which has also been shown to increase with age, possibly due to excitatory/inhibitory imbalance or recruitment of additional cortical regions. This could affect the estimated SRT_neuro. This study investigated the age-related changes in the temporal response function (TRF) and the feasibility of SRT_neuro estimation across age. Twenty YNH and 22 older normal-hearing (ONH) participants listened to audiobook excerpts at various signal-to-noise ratios (SNRs) while EEG was recorded using 66 scalp electrodes and 12 in-ear-EEG electrodes. A linear decoder reconstructed the speech envelope, and the Pearson's correlation was calculated between the reconstructed and speech-stimulus envelopes. A sigmoid function was fitted to the reconstruction-accuracy-versus-SNR data points, and the midpoint was used as the estimated SRT_neuro. The results show that the SRT_neuro can be estimated with similar precision in both age groups, whether using all scalp electrodes or only those in and around the ear. This consistency across age groups was observed despite physiological differences, with the ONH participants showing higher reconstruction accuracies and greater TRF amplitudes. Overall, these findings demonstrate the robustness of the SRT_neuro method in older individuals and highlight its potential for applications in age-related hearing loss and hearing-aid technology.

Keywords

ear-EEG, EEG, speech reception threshold, neural decoding, speech intelligibility in noise

Introduction

Understanding speech in noise is a critical and challenging task for hearing-aid users (Kochkin, 2002). A common method to evaluate this ability is by measuring the speech reception threshold (SRT), a behavioral measure typically defined as the signal-to-noise ratio (SNR) at which a participant can accurately repeat 50% of the presented speech items (SRT_beh). This approach, however, relies on active participation of the listener, which might not always be feasible.

To address this limitation, recent research has explored the use of electroencephalography (EEG) measured while the subjects are listening to speech-in-noise stimuli to estimate the SRT (SRT_neuro) (Borges et al., 2025; Lesenfants et al., 2019). This approach offers a promising alternative for assessing speech intelligibility without relying on active behavioral responses.

Researchers have estimated the SRT_neuro with scalp EEG and linear models in younger normal-hearing (YNH) listeners. Both encoding (forward) and decoding (backward) models have been used to estimate SRT_neuro (Borges et al., 2025; Lesenfants et al., 2019; Vanthornhout et al., 2018), with decoders offering robust speech envelope reconstruction and encoders providing spatial and temporal insights through the temporal response function (TRF) (Alickovic et al., 2019). The TRF is similar to the event-related potential (ERP). However, unlike ERPs, which average responses over multiple repetitions of discrete stimuli (Luck, 2014), the TRF can be calculated for continuous and non-repeated stimuli and reflects only the selected stimulus feature, not the entire stimulus. A decoder model that reconstructs the speech envelope from EEG was used to estimate the SRT_neuro as the midpoint of a sigmoid function fitted to SNR-versus-reconstruction-accuracy datapoints (Borges et al., 2025; Vanthornhout et al., 2018). This approach is inspired by a standardized method to determine the SRT_beh. In a previous study using whole scalp EEG and this method, an SRT_neuro within 3 dB of the SRT_beh was achieved for 100% of participants and within 2 dB for 75% (Borges et al., 2025).

Existing studies have focused on younger adults. Since age-related hearing loss is the most common type of hearing impairment (Gratton & Vázquez, 2003), it is relevant to investigate the SRT_neuro estimation in older adults. Age-related changes include enhanced neural tracking of speech, as evidenced by increased envelope reconstruction accuracy from EEG (Karunathilake et al., 2023; Presacco et al., 2016). These age-related neural changes could potentially bias the SRT_neuro estimation or challenge the method in other ways.

Importantly, SRT_neuro offers advantages beyond traditional behavioral testing. It enables a more automated, passive estimation procedure, which can be useful for individuals who cannot respond reliably, such as young children or individuals with cognitive decline. Furthermore, continuous measurement of speech intelligibility could support adaptive hearing-aid technologies or enable real-time monitoring of listening conditions. However, the use of full-scalp electrodes is impractical for integration into hearing aids. To address this limitation, a recent study investigated the feasibility of measuring SRT_neuro using a more discreet and unobtrusive approach: electrodes placed in and around the ear in YNH individuals (Borges et al., 2024). It was found that SRT_neuro estimates derived from the electrode configuration of in-ear EEG and 4 electrodes around the ear closely matched the SRT_beh, with results similar to SRT_neuro estimates obtained based on a full-scalp array with 66 electrodes. This finding is an important step toward potential integration of SRT_neuro measurements into hearing aids and other ear-worn devices.

Despite these advances, the feasibility of using scalp and ear-EEG for SRT_neuro estimation in older normal-hearing (ONH) individuals remains unexplored. The present study aims to fill this gap by evaluating SRT_neuro estimation in an ONH group using both scalp and ear-EEG configurations, and by comparing the results with those obtained for a YNH group (Borges et al., 2025). A secondary objective was to investigate noise-induced changes in the TRF in ONH participants in order to better understand the neurological changes that could impact the SRT_neuro estimation.

Materials and Methods

Participants

In this study, a new data set was collected from 22 ONH participants (16 females, 6 males), aged 57 to 76 years (mean age: 65 years) and combined with an existing data set from 20 YNH participants, originally collected for another study (Borges et al., 2025). The inclusion criteria for the ONH individuals were identical to those used for the YNH study: right-handed, no significant dyslexia affecting daily life, and no history of neurological disorders (Borges et al., 2025), with the exception of the age and hearing threshold requirements. The YNH population was required to be between 18 and 30 years old and have normal hearing defined as pure-tone thresholds of max 20 dB HL (dB hearing level) at 0.125, 0.25, 0.5, 0.75, 1, 1.5, 2, 3, 4, 6, and 8 kHz. The ONH participants were required to be above 50 years old and the ear with the lower threshold (i.e., the “better” ear) was required to meet the following criteria: a maximum threshold of 25 dB HL in the frequency range of 0.125 to 3 kHz, 30 dB HL at 4 kHz, 40 dB HL at 6 kHz, and 45 dB HL at 8 kHz. The other ear was allowed to have a 10-dB higher maximum threshold. The study was approved by Aarhus University's Institutional Review Board with the approval number 2023-014.
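The ONH audiometric criteria above can be written as a small check. This is only an illustrative sketch, not the screening procedure used in the study: the function names are hypothetical, and because the text does not specify how the "better" ear was determined, the sketch simply accepts a participant if either ear can play the role of the better ear while the other ear stays within the 10-dB relaxed limits.

```python
def onh_limit_db_hl(freq_khz):
    """Maximum allowed threshold (dB HL) for the better ear, per the stated criteria."""
    if 0.125 <= freq_khz <= 3:
        return 25
    limits = {4: 30, 6: 40, 8: 45}
    if freq_khz in limits:
        return limits[freq_khz]
    raise ValueError(f"no criterion stated for {freq_khz} kHz")


def ear_within_limits(thresholds, extra_db=0):
    """thresholds maps frequency in kHz to the measured threshold in dB HL."""
    return all(level <= onh_limit_db_hl(freq) + extra_db
               for freq, level in thresholds.items())


def meets_onh_criteria(left, right):
    # Assumption: either ear may serve as the "better" ear; the other ear
    # is allowed thresholds up to 10 dB higher.
    return ((ear_within_limits(left) and ear_within_limits(right, 10))
            or (ear_within_limits(right) and ear_within_limits(left, 10)))
```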

Experimental Setup

The experimental paradigm and setup followed the protocol outlined in the previous study (Borges et al., 2025). The study consisted of three visits. During the first visit, the SRT_beh was estimated, the Edinburgh Handedness Inventory test (Oldfield, 1971) was conducted, a reading span test (Daneman & Carpenter, 1980) was performed, and ear impressions were taken. The second visit involved data collection for a separate study and is therefore not further described here. The third visit involved EEG recordings for the current study.

Behaviorally Estimated Speech Reception Threshold—SRT_beh

The stimuli were presented by means of Etymotic ER-1 insert earphones (Etymotic Research, Inc., IL, USA) with disposable foam tips, driven via a soundcard (RME Hammerfall DSP Multiface II, Audio AG, Germany).

The SRT_beh was estimated using the Danish Hearing In Noise Test (HINT) sentences (Nielsen & Dau, 2011) and an adaptive procedure. The 50% word reception score was found using steady-state speech-shaped noise, with the first sentence starting at an SNR of −10 dB. The speech level was set at 65 dB sound pressure level (SPL), while the level of the steady-state speech-shaped noise was varied to obtain the desired SNR. Initially, two concatenated HINT training sentence lists, comprising 40 sentences, were used for training. Subsequently, two concatenated HINT sentence lists were used to determine the SRT_beh (see details in Borges et al., 2025).

Experimental Paradigm and Setup for Speech Reception Threshold Estimated From EEG—SRT_neuro

The stimuli were presented in the same setup as for the SRT_beh, with the exception that the sound tube from the earphones was connected to the sound bore in the ear-EEG earpieces for stimulus presentation during the EEG recordings. Scalp and in-ear EEG were recorded concurrently with two separate amplifiers at a sampling rate of 4096 Hz. Scalp EEG was recorded using the Biosemi Active EEG (Amsterdam, Netherlands) system with a 64-channel cap and 2 external mastoid electrodes and reference (CMS) between PO3 and POz. Ear-EEG was recorded using a SAGA32+/64+ system (TMSi, Oldenzaal, Netherlands) with 12 in-ear ear-EEG electrodes and an additional Fpz electrode using an average reference during acquisition. The SAGA amplifier features a hardware-implemented average reference, eliminating the need for a specific reference electrode during data acquisition. The Fpz electrode location was recorded by both amplifiers to allow for merging the data from the two amplifiers. The 12 silver/silver chloride (Ag/AgCl) ear-EEG electrodes had a diameter of 4 mm and were placed in the individually molded earpieces in positions ExA, ExB, ExC, ExT, ExI and ExK, where x is replaced with R if placed on the right-ear earpiece and with L if placed on the left-ear earpiece. The naming convention for the in-ear electrodes is outlined in Kidmose et al. (2013), the design of the soft earpieces is described in Kappel et al. (2019), and the electrodes are described and characterized in Kappel and Kidmose (2022). For an illustration of the electrodes and their placement see Borges et al. (2024). Before inserting the in-ear-EEG earpieces, the participants’ ears were cleaned with a wet cotton swab. The quality of the signals recorded from the in-ear electrodes was assessed through visual inspection in a live viewer, specifically by checking for artefacts induced by eye movements and jaw clenches.

The SRT_neuro was estimated by presenting audiobook excerpts, each lasting approximately 60 seconds. This length was chosen to balance data quality and participant fatigue. The audiobook excerpts were played to the participants at five different SNRs relative to the behaviorally measured SRT (SRT_beh −4 dB, SRT_beh −2 dB, SRT_beh, SRT_beh +2 dB, SRT_beh +4 dB) and without noise (clean speech). The SNR range was chosen to obtain an informative variation in speech reception. If the range is too narrow, the variation in speech reception will be small and mainly driven by inter-trial variability. Conversely, if the range is too wide, the speech reception may become nearly binary, resulting in the subject either understanding almost nothing or nearly everything. The presentation level of the speech was fixed at 65 dB SPL while the steady-state speech-shaped noise was adjusted according to the desired SNR. A total of 16 excerpts were presented at each SNR in a randomized order. The audiobook material was filtered using a first-order lowpass filter with a cutoff frequency of 2 kHz to approximate the third-octave band power spectral density of the HINT material and thereby enhance comparability between the audiobook material and the HINT material. To keep the participants engaged in the listening task, a two-alternative forced-choice content-related question followed each excerpt. After every 16 trials, participants were given a recreational task, such as describing their morning routine in as much detail as possible. See further details on the SRT_neuro paradigm and experimental setup in Borges et al. (2025) and Borges et al. (2024).
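As a worked illustration of the level arithmetic described above, the following sketch lists the five noisy conditions for one participant and the noise level implied by a fixed 65 dB SPL speech level. The function name and the example SRT_beh value are illustrative assumptions; the relation used is simply SNR = speech level − noise level, with both levels in dB SPL.

```python
SPEECH_LEVEL_DB_SPL = 65.0                     # fixed speech level stated in the text
SNR_OFFSETS_DB = (-4.0, -2.0, 0.0, 2.0, 4.0)   # offsets relative to SRT_beh


def noise_conditions(srt_beh_db):
    """Return (snr_db, noise_level_db_spl) for the five noisy conditions."""
    conditions = []
    for offset in SNR_OFFSETS_DB:
        snr = srt_beh_db + offset
        noise_level = SPEECH_LEVEL_DB_SPL - snr  # SNR = speech level - noise level
        conditions.append((snr, noise_level))
    return conditions


# Hypothetical participant with an SRT_beh of -4.8 dB
for snr, noise in noise_conditions(-4.8):
    print(f"SNR {snr:+.1f} dB -> noise at {noise:.1f} dB SPL")
```

For this hypothetical participant the noise level spans roughly 65.8 to 73.8 dB SPL across the five conditions, while the clean-speech condition contains no noise.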

Analysis

All analyses were conducted using MATLAB (The MathWorks Inc, Massachusetts, USA) and the mTRF toolbox (Crosse et al., 2016).

Stimulus Preprocessing

The envelopes of the audiobook excerpts were extracted as described in Borges et al. (2024). The broadband envelope was obtained as the magnitude of the analytic signal, computed using the Hilbert transform. Since the human auditory system is compressive, so that louder sounds are amplified less than softer sounds, a power law of x(t)^0.6, where x(t) represents the envelope at time t, was applied. This is a common way to model cochlear compression (Biesmans et al., 2017). The resulting compressed envelope was bandpass filtered between 1 and 8 Hz using a zero-phase filter implemented in MATLAB with the “filtfilt” function, employing a sixth-order Butterworth polynomial. This filter was chosen for its flat passband and a roll-off steep enough to preserve the signal of interest while attenuating unwanted frequencies. The filtered envelope was then downsampled to 64 Hz for further analysis.
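To make the effect of the power-law step concrete, the sketch below applies the exponent 0.6 to two envelope values. Only the compression step is shown; the Hilbert transform, the bandpass filter and the downsampling are omitted, and the function name is an illustrative assumption. The envelope values are assumed to be non-negative, as a Hilbert envelope is.

```python
def compress_envelope(envelope, exponent=0.6):
    """Apply power-law compression x(t)**exponent to a non-negative envelope."""
    return [value ** exponent for value in envelope]


soft, loud = 0.01, 1.0                      # hypothetical envelope values
c_soft, c_loud = compress_envelope([soft, loud])

print(loud / soft)       # input ratio between the loud and the soft value
print(c_loud / c_soft)   # ratio after compression, equal to 100 ** 0.6
```

A 100-fold difference in envelope amplitude shrinks to a factor of 100^0.6, about 15.8, after compression, which is the sense in which louder segments are de-emphasized relative to softer ones.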

EEG Preprocessing

The EEG data were processed as described in Borges et al. (2024). The in-ear EEG and scalp EEG were processed separately. Channels exhibiting a constant signal, such as those that were saturated, were first identified. The in-ear EEG electrodes were then referenced to the average of all in-ear electrodes, while the scalp electrodes were referenced to Fpz. Flat electrodes and electrodes with a standard deviation (SD) above three times the mean SD of the channels recorded with the same amplifier after highpass filtering with a cut-off frequency of 1 Hz were rejected. For the scalp EEG, the rejected channels were replaced using spherical interpolation, whereas for the in-ear EEG rejected electrodes were either omitted (for AEar referenced, see below) or replaced with the mean of the electrodes in the local area (Ear). Then all channels were referenced to Fpz, downsampled to 256 Hz, and bandpass filtered between 1 and 8 Hz using a zero-phase filter based on a sixth-order Butterworth polynomial. The filtered channels were further downsampled to 64 Hz and then epoched. To remove extreme artefacts, all values greater than 100 μV or smaller than −100 μV were removed, along with 32 datapoints (0.5 s) before and after each artefact. To preserve the correlation structure of the signal, the removed points were reconstructed using autoregressive modeling, implemented with the “fillgaps” function in MATLAB. To evaluate the performance of the SRT_neuro estimation across various electrode configurations, 13 configurations were selected, including in-ear electrodes and electrodes located around the ear (T7/8 and M1/2). In electrode configurations containing only two electrodes, one electrode was re-referenced to the other. In configurations with more than two electrodes, an average reference across all included electrodes was used. For reference, performance was also evaluated using a full-scalp electrode configuration, referenced to the mastoid average. An overview of the in- and around-ear electrode configurations is provided in Table 1. For graphical illustration, see Supplementary Material S1.
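The amplitude-based artefact marking described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's MATLAB code: the function name is hypothetical, the input is one channel in μV at 64 Hz, and only the marking step is shown. The subsequent autoregressive gap filling ("fillgaps" in MATLAB) is not reproduced here.

```python
def artifact_mask(samples_uv, threshold_uv=100.0, pad=32):
    """Mark samples beyond +/- threshold_uv plus `pad` samples on either side.

    With 64 Hz data, pad=32 corresponds to the 0.5 s mentioned in the text.
    Returns a list of booleans, True where a sample should be removed.
    """
    n = len(samples_uv)
    mask = [False] * n
    for i, value in enumerate(samples_uv):
        if abs(value) > threshold_uv:
            start = max(0, i - pad)
            stop = min(n, i + pad + 1)
            for j in range(start, stop):
                mask[j] = True
    return mask


# Hypothetical channel: flat signal with one 150 uV excursion at sample 100
signal = [0.0] * 200
signal[100] = 150.0
mask = artifact_mask(signal)
print(sum(mask))  # number of samples marked for reconstruction
```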

Table 1.

An Overview of the Investigated Electrode Configurations With Electrodes In and Around the Ear.

Configuration abbreviation	Active electrode(s)	Reference electrode(s)
Ear	12 ear-EEG electrodes	Average of active electrodes
AEar	Average left ear-EEG	Average right ear-EEG
M	M1	M2
T	T7	T8
EarT	12 ear-EEG electrodes, T7, T8	Average of active electrodes
AEarT	Average right ear-EEG, average left ear-EEG, T7, T8	Average of active electrodes
MT	M1, M2, T7, T8	Average of active electrodes
EarM	12 ear-EEG electrodes, M1, M2	Average of active electrodes
AEarM	Average right ear-EEG, average left ear-EEG, M1, M2	Average of active electrodes
EarMT	12 ear-EEG electrodes, M1, M2, T7, T8	Average of active electrodes
LAEarMT	Average left ear-EEG, M1, T7	Average of active electrodes
RAEarMT	Average right ear-EEG, M2, T8	Average of active electrodes
AEarMT	Average right ear-EEG, average left ear-EEG, M1, M2, T7, T8	Average of active electrodes

The abbreviation for the configuration is shown along with the active electrode(s) and the reference electrode(s).

Temporal Response Function

An encoding model was trained to predict the EEG response from the envelope of the presented stimuli. Leave-one-out cross-validation was applied within each condition, resulting in 16 encoder models trained per condition and participant. TRF estimation involves inversion of the stimulus property covariance matrix. To prevent overfitting, ridge regularization with a regularization parameter (λ) of 100 was applied across all TRF estimations. This λ value was used across all conditions and all participants, as it resulted in the highest overall mean reconstruction accuracy across participants and conditions in a previous study including YNH participants, when testing λ values in the range [10⁻⁶ to 10⁵] (Borges et al., 2025), and it ensures a fair comparison between models. The noise floor was determined by calculating the Pearson's correlation between the EEG data predicted from 68 audio excerpts, which were not presented to the participants, and the recorded EEG. These audio excerpts, which were extracted from the same audiobook material as the experimental speech stimuli, went through the same preprocessing as the experimental stimuli.
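The leave-one-out scheme can be illustrated with a short index-splitting sketch: with 16 excerpts per condition, each fold trains on 15 excerpts and tests on the remaining one, giving 16 models. The function name is an illustrative assumption, and the model fitting itself (ridge-regularized TRF estimation in the mTRF toolbox) is not shown.

```python
def leave_one_out_folds(n_trials):
    """Return a list of (train_indices, test_index) pairs, one per trial."""
    folds = []
    for test_index in range(n_trials):
        train_indices = [i for i in range(n_trials) if i != test_index]
        folds.append((train_indices, test_index))
    return folds


folds = leave_one_out_folds(16)   # 16 excerpts per condition, as in the text
print(len(folds))                 # one model per held-out excerpt
print(len(folds[0][0]))           # excerpts available for training in each fold
```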

A subset of 17 channels was selected for subsequent TRF analysis (FC5, FC3, FC1, FCz, FC2, FC4, FC6, C3, C1, Cz, C2, C4, F3, F1, Fz, F2, and F4), as these channels demonstrated high prediction accuracy in previous studies (Fuglsang et al., 2017). A permutation test with α = 0.05 and a precision of 0.01 was used to test whether the mean prediction accuracy of the TRFs from the 17 channels significantly exceeded the mean noise floor within a condition. Only conditions for which a significant difference was observed at the participant level were included in the further TRF analysis. To find the amplitudes and latencies of the prominent peaks, identified manually as the distinctive peaks in the grand average TRF (P1, N1, and P2), a Gaussian function was fitted to each peak on the participant-level TRF for each noise condition. The Gaussian function was parameterized as $a(t) = p \cdot \exp\left(-\left(\frac{t - l}{w}\right)^{2}\right)$, where $a(t)$ is the amplitude as a function of time t, p is the peak amplitude, l is the latency, and w is the width of the deflection. This approach aligns with that of Borges et al. (2025), and the fit was performed using the same boundary parameters. Fits with R² values at or below 0.5 were rejected and excluded from further analysis.
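The Gaussian peak model and the R² acceptance rule can be sketched as below. The function names are illustrative assumptions, and the sketch only evaluates a given fit; the fitting itself and its boundary parameters (which follow Borges et al., 2025) are not reproduced.

```python
import math


def gaussian_peak(t_ms, p, l, w):
    """a(t) = p * exp(-((t - l) / w)**2): amplitude p, latency l, width w."""
    return p * math.exp(-((t_ms - l) / w) ** 2)


def r_squared(observed, predicted):
    """Coefficient of determination between observed and predicted values."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - q) ** 2 for o, q in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def accept_fit(observed, predicted, threshold=0.5):
    """Fits with R^2 at or below the threshold are rejected."""
    return r_squared(observed, predicted) > threshold
```

At t = l the function returns the peak amplitude p, and at t = l ± w it has decayed to p/e, which is one way to read the width parameter.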

Relationships between TRF amplitudes, latencies, and SNR were examined using linear mixed models (six models in total) in RStudio (R Core Team, Vienna, version 4.3.2) with the nlme package (Lindstrom & Bates, 1990; Pinheiro & Bates, 1996). A mixed model was chosen to account for inter-subject variability in repeated measures. The relationship was investigated using the model $F \sim \mathrm{SNR} + \mathrm{Group} + (1 \mid P)$, where F is the latency or amplitude to be predicted, SNR and Group are fixed effects, and P is the participant-specific offset, modeled as a random effect. The participant-specific slope was not included as a random variable, since the purpose was to find a general trend across participants. The residuals were analyzed for normality by visual inspection of the histogram and QQ plots. The significance of the fixed effects was assessed using a univariate Wald test with α = 0.05, followed by Holm–Bonferroni correction. Only the conditions containing noise were included in the statistical analysis; the clean speech condition was not considered.

To compare the mean prediction accuracies between the two age groups within each condition, a cluster permutation test was used to control for multiple comparisons across electrodes. This analysis was performed using the “ft_freqstatistics” function in FieldTrip (Oostenveld et al., 2011). The following settings were used: Monte-Carlo estimates of the significance probabilities, independent samples t-test statistics, a randomization of 1000 repetitions and a cluster α of 0.05. The “ft_prepare_neighbors” function was used to define the neighborhood structure using the triangulation method.

Speech Reception Threshold Estimation From EEG

The SRT_neuro was estimated using a linear decoder that was trained on the clean speech data to reconstruct the envelopes of the speech signals contained in the presented speech-and-noise stimuli (Borges et al., 2024). The reconstruction accuracy was calculated as the Pearson's correlation between the actual and the reconstructed envelopes. The noise floor was found using the same approach as used for the encoder. The reconstruction accuracy for the clean speech condition was evaluated using leave-one-out cross-validation, where one clean-speech excerpt was used for testing and 15 clean-speech excerpts were used for training. This procedure was repeated 16 times, with a different test trial in each iteration.
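For reference, the reconstruction accuracy is a plain Pearson correlation between two equally long sequences. A minimal sketch follows; the function name and the example values are illustrative assumptions.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical envelope samples: the stimulus and a noisy reconstruction
actual = [0.1, 0.4, 0.8, 0.5, 0.2, 0.3]
reconstructed = [0.2, 0.3, 0.6, 0.6, 0.1, 0.4]
print(round(pearson_r(actual, reconstructed), 3))
```

Because the correlation is invariant to scaling and offset, the decoder output does not need to match the envelope's absolute amplitude for the accuracy to be high.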

To investigate whether the mean reconstruction accuracy in the ONH individuals was significantly higher than that of the YNH individuals, as observed in Presacco et al. (2016), a permutation test was conducted. This test had a precision of 0.01 and a significance level of 5%. The null hypothesis was that the mean reconstruction accuracy of the ONH individuals was not higher than that of the YNH individuals. The alternative hypothesis was that the mean reconstruction accuracy in the ONH individuals was significantly higher than that of the YNH individuals.
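A one-sided permutation test of this kind can be sketched as follows. This is a generic label-shuffling implementation under stated assumptions, not the study's code: the function name, the number of permutations, the random seed and the example values are all illustrative, and the "precision of 0.01" mentioned above is not modeled.

```python
import random


def permutation_test_greater(group_a, group_b, n_permutations=10000, seed=0):
    """One-sided p-value for H1: mean(group_a) > mean(group_b).

    Group labels are shuffled at random; the p-value is the share of
    permutations whose mean difference is at least as large as observed.
    """
    def mean(values):
        return sum(values) / len(values)

    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if mean(pooled[:n_a]) - mean(pooled[n_a:]) >= observed:
            count += 1
    # Add-one correction so the estimated p-value is never exactly zero
    return (count + 1) / (n_permutations + 1)


# Hypothetical reconstruction accuracies for two small groups
older = [0.20, 0.22, 0.25, 0.21, 0.24, 0.23]
younger = [0.10, 0.12, 0.11, 0.13, 0.09, 0.14]
print(permutation_test_greater(older, younger) < 0.05)
```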

A sigmoid function was then fitted to the reconstruction-accuracy-versus-SNR data points for each individual participant. The function used for the fit, as described by Farris-Trimble and McMurray (2013), is given by the following equation: $S(\mathrm{SNR}) = \frac{p - b}{1 + \exp\left(4 \cdot \frac{s}{p - b} \cdot (m - \mathrm{SNR})\right)} + b$ (1)

Here, p is the maximum value of the sigmoid, b is the minimum value, s is the slope, and m is the midpoint. The maximum value of the sigmoid was set to the mean value of the reconstruction accuracy obtained for the clean speech condition, and the minimum value was set to the mean reconstruction accuracy obtained for the noise floor. The slope and the midpoint of the sigmoid were estimated using a non-linear least-squares fitting procedure with the “lsqcurvefit” function in MATLAB. For this approach to be valid, an increase in reconstruction accuracy as a function of increasing SNR is required. Therefore, a permutation test was performed to test whether the reconstruction accuracy values obtained for the conditions SRT_beh +2 dB, SRT_beh +4 dB and clean speech were significantly higher than the noise floor and the reconstruction accuracy values obtained for the conditions SRT_beh −4 dB and SRT_beh −2 dB, with α = 0.05. A fit was conducted only if a significant increase in reconstruction accuracy was found. Furthermore, a fit was not performed if the mean of the reconstruction accuracy in more than two conditions was above the maximum value p (i.e., above the mean reconstruction accuracy obtained for clean speech) or if the mean of the reconstruction accuracy in more than two conditions was below the minimum value b (i.e., below the noise floor). The slope was restricted to positive values during the fitting procedure. The fitting procedure was repeated 100 times, and the mean values of the parameters obtained in the 10 fits with the highest R² values were used as the function parameters. The resulting SRT_neuro values outside the boundaries of [−40, +40] dB were discarded, as they were deemed to be unrealistic estimations, which was the case for three participants in four different conditions (1.3% of the estimations across all electrode configurations and participants).
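The role of the sigmoid parameters can be illustrated with a small sketch of Equation (1). The fitting step here is a plain grid search over candidate slopes and midpoints, swapped in for the non-linear least-squares fit ("lsqcurvefit") so that the example stays self-contained; the function names, the grids and the example values are illustrative assumptions, and the repeated-fit averaging described above is not reproduced.

```python
import math


def sigmoid(snr, p, b, s, m):
    """Equation (1): maximum p, minimum b, slope s at the midpoint m."""
    return (p - b) / (1.0 + math.exp(4.0 * s / (p - b) * (m - snr))) + b


def fit_midpoint(snrs, accuracies, p, b, slopes, midpoints):
    """Grid search for the (slope, midpoint) pair with the least squared error."""
    best = None
    for s in slopes:            # only positive candidate slopes should be supplied
        for m in midpoints:
            err = sum((sigmoid(x, p, b, s, m) - y) ** 2
                      for x, y in zip(snrs, accuracies))
            if best is None or err < best[0]:
                best = (err, s, m)
    return best[1], best[2]


# Hypothetical participant: p from clean speech, b from the noise floor
p, b = 0.20, 0.02
snrs = [-9.0, -7.0, -5.0, -3.0, -1.0]
accuracies = [sigmoid(x, p, b, 0.03, -5.0) for x in snrs]  # synthetic data
slopes = [0.01 * i for i in range(1, 11)]
midpoints = [-10.0 + 0.5 * i for i in range(21)]
slope_hat, srt_neuro = fit_midpoint(snrs, accuracies, p, b, slopes, midpoints)
print(srt_neuro)  # estimated midpoint, i.e., the SRT_neuro in dB SNR
```

At SNR = m the function evaluates to (p + b)/2, halfway between the noise floor and the clean-speech accuracy, and its derivative there equals s, which is why the midpoint is read out directly as the SRT_neuro.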

The SRT_neuro estimation was evaluated using three performance measures: i) the percentage of valid SRT_neuro estimations, where higher percentages indicate the method's applicability to more participants; ii) the number of participants with a difference between SRT_beh and SRT_neuro within ±3 dB, where higher numbers indicate better precision; and iii) the SD of the difference between SRT_beh and SRT_neuro, where lower values indicate better precision.
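The three performance measures can be computed as in the sketch below. The function name and the example values are illustrative assumptions, missing estimates are represented by None, and the sample SD (n − 1 denominator) is assumed because the text does not state which SD definition was used.

```python
import statistics


def performance(srt_beh, srt_neuro, limit_db=3.0):
    """Return (percent valid, number within +/- limit_db, SD of differences).

    srt_neuro entries are None where no valid estimate was obtained.
    """
    diffs = [beh - neuro
             for beh, neuro in zip(srt_beh, srt_neuro)
             if neuro is not None]
    percent_valid = 100.0 * len(diffs) / len(srt_beh)
    n_within = sum(1 for d in diffs if abs(d) <= limit_db)
    sd = statistics.stdev(diffs)  # assumption: sample SD, needs >= 2 valid estimates
    return percent_valid, n_within, sd


# Hypothetical data for four participants, one without a valid estimate
beh = [-5.0, -5.0, -5.0, -5.0]
neuro = [-4.0, -9.0, None, -6.0]
print(performance(beh, neuro))
```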

The performance of the SRT_neuro estimation was evaluated by calculating the difference between SRT_beh and SRT_neuro. To test whether the mean of the difference between SRT_beh and SRT_neuro was significantly different between the age groups, a permutation test was conducted with a precision of 0.01 and a significance level of 5%. The null hypothesis was that there is no difference in the mean, and the alternative hypothesis was that there is a difference in the mean. To address whether there was a significant difference in the variance of these differences between the two age groups, a two-sample F-test for equal variances was performed. The null hypothesis for the F-test was that the differences in the two groups come from normal distributions with the same variance, and the alternative hypothesis was that the differences come from normal distributions with different variances.

Results

The measured pure-tone thresholds for the ONH group are shown in Figure 1a. The SRT_beh ranged from −5.4 dB to −4.0 dB, with a mean of −4.8 dB, in the ONH group (see Figure 1b). To investigate age-related differences, this dataset was compared with data from 20 YNH participants collected in a previous study. In this group, the SRT_beh ranged from −6.0 to −3.3 dB with a mean of −5.4 dB (see Figure 1b; for details see Borges et al., 2025).

Figure 1.

(A) Pure-Tone Thresholds Obtained in the Left and Right Ear for Each of the ONH Participants (Thin Lines) Along with the Mean and the Standard Deviation Across the Population (Bold Lines and Error Bars). The Inclusion Criteria for the “Better” Ear are Shown with a Red Dotted Line, and the Maximum Threshold Allowed for the Other Ear (10 dB Higher) is Shown as a Solid Red Line. (B) Boxplot of the Measured SRT_beh Values for Both ONH and YNH Participants. The Red Line Indicates the Median, the Edges of the Blue Box Represent the 25th and 75th Percentiles, the Black Whiskers Mark the Most Extreme Values, and a Red Cross Marks Outliers Defined as a Value More than 1.5 Times the Interquartile Range Away from the Bottom or Top of the Box. SRT_beh Values for Individual Participants are Shown as Black Circles (Horizontally Jittered for Improved Readability).

In Figure 2a, the resulting grand average TRFs for the two age groups are shown. Higher overall amplitudes are observed for the ONH individuals (solid lines) compared to the YNH individuals (dotted lines), while no clear differences in latency are evident between the two groups. TRFs for the individual participants can be found in Supplementary Material S2. The number of conditions for which a given component was not included on the participant level, due to an R² for the Gaussian function being at or below 0.5 or the prediction accuracy not being significantly above the noise floor, is reported in Supplementary Material S3.

Figure 2.

(A) The Grand Average TRF from the ONH Individuals Shown by Solid Lines and YNH Individuals Shown by Dotted Lines in the −100 to 400 ms Time Window for All Six SNRs. (B) T-Values from the Cluster Permutation Test, Comparing the Mean Prediction Accuracies Between the Two Age Groups (ONH - YNH) within Each Condition at the Participant Level. Electrodes Included in the Significant Cluster are Marked with a Star Symbol.

Figure 2b shows topography plots of T-values from the cluster permutation test, comparing the mean prediction accuracies between the two age groups (ONH - YNH) for each SNR condition. A centrally located positive cluster, indicating higher prediction accuracies in the ONH group than in the YNH group, is consistently present across all SNR conditions. Results from the Wald test, summarized in Table 2, indicate that age group was a significant predictor of the amplitude of all components (P1, N1, P2), but not of latency. Moreover, SNR was found to be a significant predictor of the latency of all components and the amplitude of N1 and P2.

Table 2.

The Results From the Mixed Linear Model Analyzing TRF Amplitudes and Latencies Relative to SNR and Age Group, Along with the Corresponding Coefficients and Results From the Wald Test.

Component	Parameter	Fixed effect	Coefficient	t-value	Degrees of freedom	Std. error	p-Value
P1	Amplitude	SNR	0.004	0.695	101	0.006	0.4887
	Amplitude	Age Group	0.353	4.299	38	0.082	0.0001
	Latency	SNR	−1.476	−2.975	101	0.496	0.0037
	Latency	Age Group	7.068	2.074	38	3.408	0.0449
N1	Amplitude	SNR	−0.043	−6.276	101	0.007	< 10⁻⁴
	Amplitude	Age Group	−0.406	−3.239	38	0.126	0.0025
	Latency	SNR	−2.485	−10.345	101	0.240	< 10⁻⁴
	Latency	Age Group	1.494	0.426	38	3.503	0.6722
P2	Amplitude	SNR	0.039	6.404	101	0.006	< 10⁻⁴
	Amplitude	Age Group	0.273	2.781	38	0.098	0.0084
	Latency	SNR	−3.753	−5.634	101	0.666	< 10⁻⁴
	Latency	Age Group	8.657	1.761	38	4.916	0.0863

Statistically significant coefficients are shown in bold font.

An increased overall reconstruction accuracy for the ONH individuals can be observed in Figure 3. The permutation test, with the null hypothesis that the mean reconstruction accuracy of the ONH individuals is not larger than that of the YNH individuals, yielded a p-value of 0.0002. This result indicates that the ONH group exhibited significantly larger mean reconstruction accuracy compared to the YNH group.

Figure 3.

Reconstruction Accuracy for the YNH and ONH Groups Along with the Noise Floor for Each Group. The Mean for Each Group is Shown in Bold, and the Shaded Areas Represent ±1 Standard Deviation Around the Mean of the Reconstruction Accuracy for the Group.

Table 3 and Figure 4 show that all ONH individuals obtained an SRT_neuro within 3 dB of their SRT_beh when using the Scalp electrode configuration. The SD of the difference between the SRT_beh and the SRT_neuro was 1.2 dB, with a median of 0.1 dB. A similar trend was observed for the YNH individuals. For the Scalp configuration, the permutation test with the null hypothesis of no between-age-group difference in the mean of the difference between SRT_beh and SRT_neuro was non-significant (p = 0.7411), indicating no evidence of a difference in mean values and thus in SRT estimation precision. Additionally, a two-sample F-test for equal variances of the difference, with a null hypothesis of no difference in variance, was non-significant (p = 0.5818), suggesting no evidence of a difference in variance.

Figure 4.

The Upper Panel Shows a Bar Plot of the Percentage of Participants for Whom a Valid SRT_neuro was Obtained, with Data from the YNH Individuals Shown in Blue and Data from the ONH Individuals Shown in Red. The Lower Panel Depicts a Boxplot of the Difference Between SRT_beh and SRT_neuro; the Median is Shown as a Red Line, the 25th and 75th Percentiles are Indicated by the Boxes, and the Whiskers Indicate Extreme Values for Non-Outliers. The Participant-Specific Data Points are Depicted as Black Circles. Outliers are Defined as Datapoints more than 1.5 Times the Interquartile Range Away from the Top or Bottom of the Box and Marked with a Red Cross. 3-dB Limits are Shown as Green Dotted Lines.

Table 3.

Comparison of the Difference Between SRT_beh and SRT_neuro for the ONH and YNH Individuals.

	Younger normal-hearing (YNH) individuals			Older normal-hearing (ONH) individuals		
Configuration	Valid estimates [%]	Within 3 dB [%]	SD [dB]	Valid estimates [%]	Within 3 dB [%]	SD [dB]
Ear	45	30	3.05	41	23	2.67
AEar	40	30	2.58	41	32	1.87
M	35	25	2.83	50	36	3.20
T	70	55	2.58	77	36	3.64
EarM	70	30	6.80	45	36	2.13
AEarM	55	35	4.21	55	32	2.63
EarT	95	55	6.75	77	64	3.75
AEarT	85	70	2.04	82	68	2.44
MT	90	85	1.65	91	86	1.86
EarMT	100	70	2.76	82	68	2.80
AEarMT	100	95	1.80	86	77	2.10
LAEarMT	65	50	4.58	86	73	2.10
RAEarMT	95	95	1.56	77	59	4.31
Scalp	100	95	1.37	100	100	1.21

For each age group, the first column shows the percentage of valid SRT_neuro estimates out of all included participants. The second column shows the percentage of participants whose SRT_neuro was within 3 dB of their SRT_beh, and the third column shows the SD of the difference between SRT_beh and SRT_neuro across the valid SRT_neuro estimates.
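The three summary values reported per configuration and age group could be computed along the following lines. This is a minimal sketch under stated assumptions: missing SRT_neuro estimates are encoded as NaN, both percentages are taken relative to all participants in the group, and the SD is the sample SD over the valid estimates. The function name and the example values are hypothetical and are not taken from the study.

```python
import numpy as np

def summarize_configuration(srt_beh, srt_neuro, limit_db=3.0):
    """Return (% valid estimates, % within limit_db, SD of differences in dB)."""
    srt_beh = np.asarray(srt_beh, dtype=float)
    srt_neuro = np.asarray(srt_neuro, dtype=float)
    valid = ~np.isnan(srt_neuro)              # NaN marks "no valid estimate" (assumption)
    diff = srt_beh[valid] - srt_neuro[valid]
    n_total = srt_beh.size                    # denominator: all participants (assumption)
    pct_valid = 100.0 * valid.sum() / n_total
    pct_within = 100.0 * np.sum(np.abs(diff) < limit_db) / n_total
    sd = diff.std(ddof=1) if diff.size > 1 else float("nan")  # sample SD (assumption)
    return pct_valid, pct_within, sd

# Four hypothetical participants, one without a valid SRT_neuro estimate.
srt_beh = [-7.0, -7.0, -7.0, -7.0]
srt_neuro = [-6.0, -11.0, np.nan, -7.5]
print(summarize_configuration(srt_beh, srt_neuro))
```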

For the ONH individuals, SRT_neuro estimates based on in-ear EEG electrodes (Ear and AEar) were obtained for 41% of participants, with SDs of 2.7 dB (Ear) and 1.9 dB (AEar). The proportion of participants with SRT_neuro values within ±3 dB of SRT_beh was 23% for Ear and 32% for AEar. These results closely resembled those obtained for the YNH individuals. Estimations obtained with the mastoid electrodes only (M) were roughly similar. The percentage of SRT_neuro estimates increased substantially for the ONH group when using the temporal electrodes (T). However, in the temporal configuration (T) the number of ONH participants with an SRT_neuro within ±3 dB of the SRT_beh was similar to the in-ear (Ear and AEar) and equal to the mastoid (M) configuration, suggesting that the higher percentage of estimates obtained with the T configuration came at the expense of overall reduced precision.

When combining the electrodes from the in-ear EEG configurations (Ear and AEar) with mastoids (EarM and AEarM), the results were very similar to using the mastoid or in-ear electrodes separately for the ONH individuals. In contrast, combining the in-ear and mastoid electrodes (EarM) for the YNH individuals resulted in an increase of reliable SRT_neuro estimations as compared to the M and Ear configurations. However, the percentage of participants within 3 dB for the EarM configuration was similar to that obtained for the Ear and M configurations, and the SD increased for the EarM configuration compared to the M and Ear configurations. When comparing the AEarM to the separate configurations (AEar and M) in the YNH individuals, a similar trend was observed, with an increase in reliable SRT_neuro estimations, but a higher SD in AEarM compared to AEar and M.

When combining electrodes from the Ear configurations (AEar and Ear) with the temporal electrodes, the number of estimates roughly doubled for the ONH participants, accompanied by a doubling of the number of ONH participants with an SRT_neuro within 3 dB of the SRT_beh. Furthermore, the SD decreased compared to the T configuration but increased compared to the Ear and AEar configurations. A similar trend was found in the YNH individuals. When combining the mastoid electrodes and the temporal electrodes (MT), very similar results for the ONH and YNH individuals were obtained, with a better SRT_neuro estimation compared to configurations using the temporal and mastoid electrodes separately (M and T).

The electrode configurations combining the in-ear electrodes with the mastoid and temporal electrodes (EarMT and AEarMT) did not improve the SRT_neuro estimation for the ONH participants compared to using the MT configuration. For the YNH participants, on the other hand, the AEarMT configuration yielded more reliable SRT_neuro estimates and more SRT_neuro estimates within 3 dB of SRT_beh than the MT configuration, together with a minor increase in SD. For EarMT, an increase in reliable SRT_neuro estimates was found, but also an increase in SD and a decrease in the number of SRT_neuro estimates within ±3 dB of SRT_beh.

Using only electrodes from one side of the head for the ONH individuals revealed similar results for the left side (LAEarMT) compared to both sides (AEarMT), while a slight decrease in SRT_neuro estimation quality was observed for the right side (RAEarMT) compared to both sides (AEarMT).

For the YNH individuals, the right side (RAEarMT) showed only minor differences in estimation quality compared to both sides (AEarMT), whereas the left side (LAEarMT) yielded lower estimation quality than when using electrodes from both sides (AEarMT). In the YNH individuals, RAEarMT performed almost identically to the Scalp configuration. This was not the case for the ONH individuals, where the best-performing side (LAEarMT) yielded a lower overall number of SRT_neuro estimates (86% vs. 100%), fewer SRT_neuro estimates within 3 dB of the measured SRT_beh (73% vs. 100%), and a higher SD of the difference between SRT_neuro and SRT_beh (2.1 dB vs. 1.2 dB) compared to the Scalp configuration.

Discussion

Summary of the Main Results

A statistically significant increase in stimulus reconstruction accuracy was observed for the ONH compared to the YNH individuals (see Figure 3), along with enhanced fronto-central EEG prediction accuracy for the ONH individuals compared to the YNH individuals (see Figure 2b). The ONH group had enhanced amplitudes of P1, N1, and P2 relative to the YNH group (i.e., a positive fixed effect for P1 and P2 and a negative effect for N1, Table 2). The SNR of the presented stimuli was a significant predictor of the latency of all components and of the amplitude of N1 and P2.

Regarding SRT estimation, no statistically significant difference between the two age groups was found for the mean and variance of the differences between SRT_beh and SRT_neuro when using the Scalp configuration, indicating that there was no difference in SRT estimation quality across the two groups. In the in- and around-ear configurations, the difference between SRT_beh and SRT_neuro was generally small between the two age groups, with some exceptions: (i) the SRT estimations for YNH individuals improved more when temporal electrodes were used as compared to in-ear electrodes than for ONH individuals, (ii) an increase in the number of SRT_neuro estimates was observed when using the EarM configuration compared to M and Ear in the YNH individuals but not in the ONH individuals, (iii) when using electrodes from only one side of the head, the best SRT_neuro estimate for the ONH individuals was obtained using the left side (LAEarMT configuration), whereas the best estimate for the YNH individuals was obtained using the right side (RAEarMT configuration), (iv) the estimation quality for the AEarMT/RAEarMT configuration was similar to the Scalp configuration (i.e., very high) for YNH individuals whereas it was slightly reduced as compared to the Scalp configuration for the ONH individuals.

Age-Related Differences in Reconstruction Accuracy and TRFs

The TRFs in Figure 2a show an increase in amplitude in the ONH individuals compared to the YNH individuals, which was confirmed by the statistical test showing that age group was a significant predictor of amplitude for all components (P1, N1, and P2). This increase in amplitude, and thereby in the SNR of the EEG signal, likely contributed to the enhanced reconstruction accuracy seen in the ONH individuals compared to the YNH individuals. The SNR of the stimuli was a significant predictor of the latency of all components and of the amplitude of N1 and P2. Caution is advised when interpreting changes in the latency and amplitude of peaks and troughs in the TRF waveform, as these features do not directly reflect the amplitude and latency of the underlying neural components. For instance, a change in the amplitude of a neural component can affect the latencies and amplitudes of multiple features in the TRF waveform (Luck & Kappenman, 2012). Furthermore, this study cannot specify the underlying causes of the observed differences, as the analysis conducted here only confirms their existence. Identifying the causes of the changes would require additional experiments and is beyond the scope of the present study. However, the increase in latency with decreasing SNR may suggest that additional neural processing is required in low-SNR conditions, which would delay the neural response. The decrease in N1 and P2 amplitude magnitude with lower SNR likely reflects the TRF encoder's challenge in tracking the speech envelope. As the speech envelope becomes masked by additive noise, the neural tracking of the speech envelope deteriorates, resulting in a lower TRF amplitude magnitude. In this study, no evidence was found to support the SNR as a reliable predictor of the P1 amplitude, suggesting that the P1 amplitude as a response to the speech stimulus envelope is less affected by SNR levels.

Enhanced envelope reconstruction accuracy in ONH individuals compared to YNH individuals has been observed in previous studies (Decruy et al., 2019; Karunathilake et al., 2023; McClaskey, 2024; Presacco et al., 2016) as well as increased TRF peak amplitudes (Karunathilake et al., 2023; Panela et al., 2024). However, in contrast to the present results, prediction accuracy has been found to decrease with age (Gillis et al., 2023), but this could be due to methodological differences, that is, using spectrogram and acoustic onset instead of envelope. This enhancement of the auditory response in ONH has been speculated to result from an excitatory/inhibitory imbalance (Alain et al., 2014), resulting in cortical hyperactivity in the auditory cortex including increased spontaneous neural firing, increased synchronization amongst neurons, and enhanced sound evoked responses (Herrmann & Butler, 2021).

Yet there is evidence that it could be a cortical effect of hearing loss rather than increasing age when it comes to neural fundamental-frequency tracking (Van Canneyt et al., 2021). Animal models have furthermore shown frequency-specific increases in spontaneous neuronal firing rate following noise exposure, linking the hyperactivity to hearing loss (Eggermont, 2015; Eggermont & Tass, 2015; Seki & Eggermont, 2003). Similar hyperactivity has also been observed in aging animal models in the absence of noise exposure. In these cases, age-related inner-ear dysfunctions such as degeneration of hair cells, the stria vascularis, and spiral ganglion cells are believed to underlie the observed hyperactivity (Bao & Ohlemiller, 2010; Dubno et al., 2013; Gratton & Vázquez, 2003; Keithley, 2020; Moore, 1987; Plack, 2014; Schmiedt, 2010).

Another potential explanation is recruitment of additional cortical regions to process the same stimuli in ONH, compensating for a reduction in the specialized processing regions (Brodbeck et al., 2018; Peelle et al., 2010). Karunathilake et al. (2023) also investigated the latency changes of M50_trf, M100_trf, and M200_trf, that is, the MEG counterparts of the P1, N1, and P2 deflections discussed in the current study. They found significant noise-related delays in latencies of these components, which support the findings of the current study. It is important to note that the study by Karunathilake et al. (2023) utilized babble noise, whereas the present study employed steady-state speech-shaped noise. It is promising for the application of the SRT_neuro method that some of the same neurological changes were also found when using more naturalistic noise such as babble noise in the stimuli. Other studies have also reported that the latencies of TRF deflections decrease and their amplitudes increase in magnitude with higher SNR levels in YNH individuals using MEG (Ding & Simon, 2013) and in preschool children using EEG (Van Hirtum et al., 2023).

Age-Related Differences in SRT_neuro Estimation

There was no statistical evidence for an age-group difference in mean or variance of the differences between SRT_beh and SRT_neuro when using the Scalp configuration, despite the overall increase in reconstruction accuracy in the ONH individuals compared to the YNH individuals, as shown in Figure 3. This suggests that the midpoint of the fitted sigmoid function (the SRT_neuro) remains unaffected by changes in overall reconstruction accuracy, which is a highly desirable characteristic of the SRT_neuro estimation method. Furthermore, the observation that the variance of the estimate is also independent of reconstruction accuracy level demonstrates the robustness of the estimation method to variability in reconstruction accuracies across individuals. A possible explanation for this robustness lies in the nature of the sigmoid fitting procedure: the fit takes the individual reconstruction accuracy level into account, which can be considered a form of normalization of the neural tracking strength of the individual participant. However, it is somewhat unclear why an improvement in reconstruction accuracy does not translate into a higher percentage of reliably estimated SRT values.
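The normalization argument above can be illustrated with a small sketch. It assumes a four-parameter logistic function (floor, gain, midpoint, slope) fitted with SciPy's `curve_fit`; the parameterization, the starting values, and the synthetic accuracies are assumptions made for the example, and the quality criteria applied in the study are not reproduced. Two synthetic participants share the same midpoint but differ in floor and gain, mimicking a lower and a higher overall reconstruction accuracy.

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(snr, floor, gain, midpoint, slope):
    """Four-parameter logistic: floor and gain absorb the overall accuracy level."""
    return floor + gain / (1.0 + np.exp(-slope * (snr - midpoint)))

def estimate_srt_neuro(snr_db, accuracy):
    """Fit the sigmoid and return its midpoint (in dB SNR) as the SRT_neuro estimate."""
    snr_db = np.asarray(snr_db, dtype=float)
    accuracy = np.asarray(accuracy, dtype=float)
    # Data-driven starting values: lowest accuracy, accuracy range, centre of the SNR grid.
    p0 = [accuracy.min(), accuracy.max() - accuracy.min(), snr_db.mean(), 1.0]
    params, _ = curve_fit(sigmoid, snr_db, accuracy, p0=p0, maxfev=10_000)
    return float(params[2])

snrs = np.arange(-12.0, 1.0, 2.0)  # hypothetical SNR grid in 2 dB steps
# Noise-free synthetic accuracies with the same midpoint (-7 dB) but different levels.
lower_level = sigmoid(snrs, 0.02, 0.08, -7.0, 0.8)
higher_level = sigmoid(snrs, 0.04, 0.14, -7.0, 0.8)
print(estimate_srt_neuro(snrs, lower_level), estimate_srt_neuro(snrs, higher_level))
```

Because floor and gain are free parameters, a uniform increase or scaling of the reconstruction accuracy changes only those two parameters in this sketch and leaves the fitted midpoint where it was.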

In the current study, the SRT_neuro obtained from electrodes in and around the ears showed comparable results for the two age groups, with a few exceptions (see Figure 4 and Table 3). For the YNH individuals, the number of reliable SRT_neuro estimates increased when using the EarM configuration compared to using M or Ear. There was no increase in the number of YNH participants with an SRT_neuro within ±3 dB difference of the SRT_beh, and the SD of the difference between SRT_neuro and SRT_beh increased. This suggests that the increase in reliable SRT_neuro estimates came at the expense of the overall precision of the estimates. This trend was not observed in the ONH individuals, indicating that the difference observed between YNH and ONH may be due to random fluctuations. The YNH individuals showed a more substantial benefit from using only temporal electrodes compared to in-ear electrodes than the ONH individuals. This could be due to an enhanced response in the ONH individuals in the areas outside the core auditory cortex (Brodbeck et al., 2018). When the response area is broader, neighboring channels may capture more synchronized activity, resulting in reduced additional information in the T electrodes, due to the recorded activity being similar between T electrodes and in-ear electrodes.

When using electrodes from only one side of the head, the best SRT_neuro estimates in the ONH individuals were found on the left side (LAEarMT configuration), whereas for the YNH individuals they were found for the right side (RAEarMT). A study by Brodbeck et al. (2018) compared the prediction accuracy of the envelope for ONH and YNH and found that there was an increased prediction accuracy in ONH and that it was particularly pronounced in the left temporal lobe. This increased activity on the left side of the head with age could explain the benefit of estimating SRT_neuro based on left-side electrodes for the ONH individuals but not for the YNH, as observed in the current study. This is further supported by the fact that the cluster permutation test in the current study revealed enhanced prediction accuracy for the ONH in a fronto-central area across all SNR conditions, with more electrodes in the cluster around the left ear compared to the right, see Figure 2b. In the current study, using only electrodes from the “better” side of the head for the ONH individuals did not yield SRT_neuro estimation performance comparable to that obtained when using all scalp electrodes, whereas for the YNH individuals this was the case. This could be due to enhanced recruitment of neurons in the areas close to the temporal lobe in ONH resulting in a smaller difference in the potential measured by the neighboring channel for the ONH compared to the YNH individuals (Brodbeck et al., 2018).

In the current study, the SNRs were chosen in 2 dB steps around the SRT_beh. Previous work (Borges et al., 2025) investigated whether this SNR selection strategy biased the SRT_neuro estimation. To assess this, the same SRT_neuro estimation methods were applied to data sets simulated based on the same underlying function but sampled for different SNR ranges. In particular, the SNR range was moved by −4 dB to 4 dB in 1 dB intervals (9 distinct SNR sets). This analysis showed no evidence of the SNR selection biasing the SRT_neuro estimation.
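The structure of such a simulation can be sketched as follows. This is not the procedure of Borges et al. (2025): the underlying sigmoid parameters, the number of SNRs per set, the noise model, and all function names are assumptions chosen for illustration. A failed fit is returned as NaN, loosely mirroring the notion of a missing SRT_neuro estimate.

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(snr, floor, gain, midpoint, slope):
    return floor + gain / (1.0 + np.exp(-slope * (snr - midpoint)))

def fit_midpoint(snr_db, accuracy):
    """Return the fitted sigmoid midpoint, or NaN if the optimizer does not converge."""
    p0 = [accuracy.min(), accuracy.max() - accuracy.min(), snr_db.mean(), 1.0]
    try:
        params, _ = curve_fit(sigmoid, snr_db, accuracy, p0=p0, maxfev=10_000)
    except RuntimeError:
        return float("nan")
    return float(params[2])

def midpoints_for_shifted_ranges(true_srt=-6.0, noise_sd=0.0, seed=0):
    """Sample one underlying sigmoid on nine shifted SNR sets and refit each set."""
    rng = np.random.default_rng(seed)
    base_snrs = true_srt + np.arange(-6.0, 7.0, 2.0)  # 7 SNRs in 2 dB steps (assumed count)
    estimates = {}
    for shift in range(-4, 5):  # -4 dB to 4 dB in 1 dB steps: 9 SNR sets
        snrs = base_snrs + shift
        clean = sigmoid(snrs, 0.02, 0.10, true_srt, 0.8)  # assumed underlying function
        noisy = clean + rng.normal(0.0, noise_sd, size=snrs.size)
        estimates[shift] = fit_midpoint(snrs, noisy)
    return estimates

for shift, estimate in midpoints_for_shifted_ranges(noise_sd=0.002).items():
    print(f"shift {shift:+d} dB -> estimated midpoint {estimate:.2f} dB")
```

A systematic drift of the estimated midpoint with the shift would indicate that the SNR selection biases the estimate; estimates scattered around the true value without a trend would not.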

Application

SRT_neuro provides a continuous measure of the SRT, offering new opportunities for real-world assessment and adaptation in hearing care. Logging of SRT_neuro during daily-life situations could inform optimization of hearing-aid performance and support personalized rehabilitation strategies. Furthermore, continuous assessment of SRT_neuro based on uncontrolled natural speech could enable hearing aids to dynamically adjust their performance in real time to optimize the user's speech intelligibility.

The present study shows that the SRT_neuro can be estimated based on in-ear-EEG alone in the ONH individuals with similar precision as in YNH individuals, especially when also including electrodes around the ear. The SRT_neuro estimation was independent of age group even though higher reconstruction accuracies were observed in the ONH individuals. This is an advantage for an automatic SRT_neuro estimation, as it suggests that the estimation method is robust with respect to effects of age on reconstruction accuracy and therefore does not need to be specifically tailored to different age groups. The SRT_neuro can be estimated using electrodes in and around the ear from one side of the head, with slightly lower precision in the ONH individuals compared to using all scalp electrodes, but with the same precision as obtained for the full scalp configuration in the YNH individuals. However, the better side for precise SRT_neuro estimation when only using electrodes in and around the ear from one side differed between the ONH and YNH groups. If an SRT_neuro measurement platform were to be used across age groups, there are thus three options: (i) electrodes from both ears could be used for the SRT_neuro estimation, (ii) the better ear for the participant could be identified and used for the SRT_neuro estimation, or (iii) the ear used for the SRT_neuro estimation could be determined by age. Since hearing impairment is associated with a higher age (Gratton & Vázquez, 2003), and in the current study the ONH group obtained more precise SRT_neuro estimation when using electrodes from the left side, it is likely that a left-sided electrode placement would yield good results for most hearing-aid users. Using electrodes from only one side of the head makes it possible to obtain the SRT_neuro without connecting the two hearing aids and would therefore be more applicable. Another solution that does not require a connection between the two hearing aids could be the use of two different reference systems (one for each side) as input for the SRT_neuro estimation; this method has not been explored in the current study.

Applying the proposed method in an actual clinical context, where the behavioral SRT (SRT_beh) is unknown, the method would likely require sampling across a broader range of SNRs to adequately capture the informative portion of the underlying sigmoid function. This would come at the cost of increased measurement duration.

In the ONH population, only the Scalp electrode configuration yielded an SRT_neuro estimate for all participants. This is due to the quality requirements implemented for the sigmoid fit and reconstruction accuracy datapoints. If ear-EEG electrode configurations were to be implemented in practice, these quality requirements should be revisited to ensure that SRT_neuro fits are only conducted when the reconstruction accuracy is reliably tracked. Furthermore, if the increase in reconstruction accuracy is not reliably tracked, additional data could be recorded and included until this is the case.

The method would likely be improved by training the model on more data, since the standard error of the mean generally decreases with $\frac{1}{\sqrt{n}}$ (Kirkwood & Sterne, 2009; Mesik & Wojtczak, 2022; Wilroth et al., 2023). Thus, conducting more trials would most likely increase the number of SRT values that can be accurately estimated. Moreover, adding more predictors such as phoneme onsets and spectrogram to the model would most likely also lead to an increase in accurately estimated SRT values (Lesenfants et al., 2019). If the model is further improved, this could enhance the usability of estimating SRT_neuro with electrodes in and around the ear. If the SRT_neuro is estimated in a hearing aid, the hearing-aid signal processing could adapt dynamically to improve speech understanding for the individual hearing-aid user when necessary. The SRT value could also be logged in the user's natural environment and thus support rehabilitation. Estimating the SRT_neuro in a real-life setting would most likely require more data than when testing in a laboratory setting. This is not necessarily a drawback since much more data could be collected outside of a laboratory setting, and ear-EEG allows long-term and discreet monitoring (Kidmose et al., 2012). Furthermore, electrodes positioned in and around the ear could offer additional benefits for hearing-aid feedback, including decoding the attended speaker (Alickovic et al., 2019; Fiedler et al., 2017; Mirkovic et al., 2016; Nguyen et al., 2025; Rotaru et al., 2024; Tanveer et al., 2024) and estimating hearing thresholds (Bech Christensen et al., 2018; Christensen et al., 2018; Sergeeva et al., 2024) and other audiometric features.
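The $\frac{1}{\sqrt{n}}$ scaling of the standard error mentioned above implies that quadrupling the number of trials halves the standard error of the mean. A brief numerical illustration, in which the across-trial SD and the trial counts are hypothetical values rather than values from the study:

```python
import numpy as np

def standard_error_of_mean(sd, n_trials):
    """Standard error of the mean for n_trials independent trials with standard deviation sd."""
    return sd / np.sqrt(n_trials)

sd_accuracy = 0.05  # hypothetical across-trial SD of the reconstruction accuracy
for n_trials in (25, 100, 400):
    sem = standard_error_of_mean(sd_accuracy, n_trials)
    print(f"n = {n_trials:3d} trials -> SEM = {sem:.4f}")
```

The relation assumes independent trials; correlated trials would reduce the benefit of additional data.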

Limitations

The SRT_neuro estimation relies on the envelope-following response in the EEG rather than on speech-intelligibility scores. Here it is important to note that the envelope-following response reflects an encoding of acoustic features, not necessarily comprehension. While the two measures are highly correlated (Ding & Simon, 2013; Iotzov & Parra, 2019; Shannon et al., 1995; Vanthornhout et al., 2018), an envelope-following response is likely necessary, but not sufficient per se, for speech intelligibility. For example, SRT_neuro could potentially still be estimated when participants listen to stimuli containing speech in an unfamiliar language, where speech understanding is absent.

The study is limited by the amount of data collected. Given that in-ear EEG is well-suited for long-term monitoring, the collection of additional data should enhance the decoding model and improve the precision of the SRT_neuro estimation.

If the SRT_neuro were implemented in a hearing aid, many unknown factors could influence the estimation, such as different types of noise, variations in room acoustics, and acoustic features of the target speech. Furthermore, the method has not been explored in hearing-impaired individuals, where a larger variation in the SRTs is expected. Therefore, to strengthen the generalizability of the findings, a logical next step would be to conduct a study with hearing-impaired individuals. This would allow for validation across a wider SRT range and provide insights into individual differences. In the current study, the YNH and ONH groups had relatively balanced hearing threshold levels, and the behavioral SRTs measured in the two groups were very similar with little interindividual variation. However, it should be noted that differences in speech intelligibility and other psychoacoustic measures between younger and older listeners with normal hearing are expected (see Goossens et al., 2017; 53-54; Regev, Oxenham, et al., 2025; Regev, Zaar, et al., 2025; Working Group on Speech Understanding & Aging, 1988), although these differences may not necessarily emerge in the conditions used in the current study (speech mixed with speech-shaped noise).

It is unclear how left/right imbalances in hearing loss may affect the quality of the SRT_neuro estimation from each side. A study by Presacco et al. (2019) compared the reconstruction accuracy of the envelope between ONH individuals and older hearing-impaired individuals, finding no significant differences between the two populations. However, one study found that elevated hearing thresholds and impaired speech intelligibility were associated with an increased correlation between the EEG and the amplitude envelope of the presented stimuli (Schmitt et al., 2022), and an increase in cortical responses to sound in hearing-impaired compared to older adults has also been observed (Alain et al., 2014; Millman et al., 2017; Tremblay et al., 2003). In the current study, enhanced reconstruction accuracy did not have an effect on the SRT_neuro estimation quality. However, changes in neural activation due to hearing loss may interact differently with the SRT estimation method used in the current study than changes in neural activation due to age.

The current study demonstrates the feasibility of estimating SRT_neuro using electrodes in and around the ear. While integrating advanced technology such as ear-EEG into hearing aids could pave the way for neuro-steered hearing aids, with SRT_neuro estimation serving as a concrete example of the potential opportunities, incorporating such technologies also introduces challenges that must be carefully balanced against user benefits and economic costs. Several of these challenges, though beyond the scope of the current study, are worth highlighting: (i) increased power consumption in an already power-constrained device; (ii) limited physical space for integrating electrodes and supporting electronics; (iii) vulnerability to electromagnetic interference from both internal and external sources; (iv) reduced control over recording conditions in real-world settings, where environmental noise may affect the SNR; (v) privacy and ethical concerns related to continuous EEG recording. Such considerations are crucial for the successful implementation and acceptance of neuro-steered hearing aids.

Conclusion

The ONH individuals showed similar SRTs to their YNH peers, while also exhibiting an overall increase in envelope reconstruction accuracy. However, the precision of the SRT_neuro estimate did not significantly differ across the two age groups. For scalp EEG, the SRT_neuro was estimated with good precision in all participants in both the YNH and ONH groups. When restricting the estimation to in-the-ear electrodes, the number of individuals with an estimated SRT_neuro decreased to 45% and 41% for the YNH and ONH groups, respectively. When combining in-the-ear and around-the-ear electrodes, the maximum percentage of individuals with an estimated SRT_neuro was 100% for the YNH and 86% for the ONH. An analysis of spatiotemporal responses through the TRF revealed that the ONH group exhibited increased amplitudes of the P1 (∼50 ms), N1 (∼120 ms), and P2 (∼200 ms) deflections compared to the YNH group. TRF latencies decreased with increasing SNR, while the amplitudes of the N1 and P2 deflections increased as the SNR increased. Overall, these findings demonstrate the robustness of the SRT_neuro estimation method with regard to age-related changes in neural speech-envelope tracking.

Supplemental Material

Supplemental material, sj-pdf-1-tia-10.1177_23312165251372462 for Age-Related Differences in EEG-Based Speech Reception Threshold Estimation Using Scalp and Ear-EEG by Heidi B Borges, Emina Alickovic, Christian B Christensen, Preben Kidmose and Johannes Zaar in Hearing

Footnotes

Acknowledgments

The authors would like to thank all the participants who dedicated their time to be part of the study. They would also like to thank Alberte Hygum Valsted, Sven-Gustav Thiesen, and Josefine Hjort for their support during data collection, Jesper Trolle and Ingelise Nielsen for their invaluable assistance in producing the ear-EEG molds, and Lorenz Fiedler for his assistance with the statistical analysis. Lastly, they thank the William Demant Foundation for making this study possible.

Ethical Considerations

The experimental protocols were approved by the Institutional Review Board (IRB) of Aarhus University (Approval number 2023-014) on September 29, 2023. Informed consent was obtained prior to participation. The participants were given the option to refuse to participate by opting out at any point of the study. The participants were given the option to withdraw their data prior to anonymization.

Consent to Participate

Informed consent from participants was obtained in writing.

Consent for Publication

The informational material provided to participants prior to obtaining their written consent stated that the data would be published in scientific journal articles.

Authors’ contributions

Heidi B Borges: conceptualization, methodology, software, formal analysis, investigation, data curation, writing – original draft, visualization, funding acquisition. Emina Alickovic: conceptualization, methodology, software, writing – review and editing, supervision, funding acquisition. Christian B Christensen: conceptualization, methodology, resources, writing – review and editing, supervision. Preben Kidmose: conceptualization, methodology, writing – review and editing, supervision, project administration, funding acquisition. Johannes Zaar: conceptualization, methodology, software, writing – review and editing, supervision, project administration, funding acquisition.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the William Demant Foundation [Grant number 21-2912].

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

Data can be provided upon reasonable request.

Supplemental Material

Supplemental material for this paper is available online.

Age-Related Differences in EEG-Based Speech Reception Threshold Estimation Using Scalp and Ear-EEG


sj-pdf-1-tia-10.1177_23312165251372462 - Supplemental material for Age-Related Differences in EEG-Based Speech Reception Threshold Estimation Using Scalp and Ear-EEG
