Abstract
Introduction
Noise reduction (NR) algorithms have been used in hearing aids (HAs) as methods to reduce undesired noise since the mid-1990s. The two main approaches exploit (a) spectral and temporal differences between target signal and noise, and (b) the spatial separation between target signal and noise sources. Aiming at speech as the target signal, there has been substantial technological development of both approaches over the last 30 years starting from simple algorithms with two microphones that suppress lateral noise sources (Kollmeier et al., 1993) to increasingly sophisticated approaches that include deep learning techniques (Andersen et al., 2021), combining spectral and spatial approaches.
For hearing-impaired adults, evaluated based on group averages, spectral NR algorithms in commercially available HAs did not until recently show a consistent speech intelligibility benefit. Significant improvements were shown only for acceptable noise levels, accompanied by moderate improvements in sound quality (see Lakshmi et al., 2021 for a review). The most recent NR algorithms based on deep neural networks, however, offer a small but significant improvement in speech intelligibility in noise (Andersen et al., 2021). The difficulty of improving speech intelligibility with NR is often attributed to the introduction of audible artifacts in the amplified HA output, which may counteract the signal-to-noise ratio (SNR) benefit. In contrast, spatial NR algorithms (i.e., “beamforming” or “directional microphones,” DIR) can improve speech intelligibility substantially in scenarios in which target speech and noise sources are spatially separated (Kidd et al., 2015; Picou et al., 2014; Valente & Mispagel, 2008). Furthermore, DIR can reduce listening effort for HA users (Andersen et al., 2021; Desjardins, 2016). In addition to this benefit compared to amplification alone, DIR improves the ease of listening in specific acoustic scenarios and is associated with an increase in HA usage time (Bentler, 2005). Speech intelligibility benefits have also been found in mild reverberation, but at the expense of poorer sound localization for the most lateral target sounds (Picou et al., 2014).
Despite clear trends at the group level, there is large interindividual variability in speech intelligibility benefit from DIR + NR across patients (Ricketts & Mueller, 2000; Zaar et al., 2024a), with some patients benefitting a lot and some only marginally, and very few even showing poorer speech intelligibility with specific settings (Neher, 2014). Currently, there is no widely accepted individual prescription rule for DIR or NR strength based on patient-specific factors comparable to, for instance, those for prescribing HA amplification based on the audiogram (e.g., National Acoustics Labs-NonLinear 2 fitting formula [NAL-NL2]; Keidser et al., 2011). Instead, a one-size-fits-all medium strength is often suggested by HA manufacturers via the fitting software, and the audiologist is left with finding optimal NR strength settings in a time-intensive trial-and-error process with the individual patient.
There has been some research trying to identify a relationship between the speech intelligibility benefit from DIR or NR and individual patient-related factors. Investigations with an early DIR algorithm showed no correlation of benefit from DIR with pure-tone audiometric information, in particular with the slope of the audiogram and the amount of high-frequency hearing loss (Ricketts & Mueller, 2000). Also, no correlation was found between age (Wu, 2010) or extent of DIR usage in everyday life (Cord et al., 2004) and benefit from DIR. However, more recent findings (Zaar et al., 2024a) showed correlations between DIR + NR benefit and both audiometric thresholds and performance in a lengthy psychoacoustic spectro-temporal modulation detection (STMD) task that can be used as a language-independent proxy for speech-in-noise measurements in realistic environments. STMD performance is a viable candidate for reflecting individual (suprathreshold) audible contrast. Therefore, the hypothesis is reasonable that the higher the contrast loss, the stronger the potential benefit from DIR + NR, since DIR + NR processing increases the contrast between target signal and background noise. Given that there now exist more elaborate versions of DIR + NR algorithms than in the previous studies, as well as a clinically applicable version of the STMD test (the audible contrast threshold test [ACT™]; Zaar et al., 2024b), these relations need to be revisited.
An important factor to consider is the closedness of the acoustic coupling. From open domes to completely closed individually manufactured earmolds without vents, there is a large variety of possibilities to couple the HA-processed sound into the patient's ear canal, which all have their advantages and disadvantages (see Winkler et al., 2016 for a review). Although, to the authors’ knowledge, no study has investigated the relationship between the closedness of acoustic coupling and DIR or NR benefit on an individual basis, there is some evidence in the literature about the implications of the closedness of acoustic coupling for the individually achieved DIR or NR benefit. The directivity index, a measure of the effectiveness of DIR, was shown to decrease with increasing vent size (Ricketts, 2000), presumably because more closed fittings allow more of the processed and less of the unprocessed ambient sound to reach the eardrum of the patient. Such directivity index differences can be used to predict average (group) differences in speech reception threshold (SRT) benefit (Ricketts et al., 2005), that is, the difference between SRTs measured with and without DIR + NR. The speech intelligibility benefit from DIR in spatial acoustic scenarios persists (Keidser et al., 2007) even when vent-transmitted sound dominates the low frequencies. However, the DIR benefit is lower for open fittings than for closed fittings (Klemp & Dhar, 2008; Magnusson et al., 2013). Thus, intentionally increasing vent size has an acoustic effect and may decrease DIR benefit. Ricketts et al. (2005), Keidser et al. (2007), Klemp and Dhar (2008), and Magnusson et al. (2013) found this effect at the group level without specifically measuring the closedness of acoustic coupling in the individual ears. However, Cubick et al. (2022) showed that there is considerable variability across individuals in terms of closedness of acoustic coupling assessed with real-ear measurements (REMs), even with the same type of instant fit ear tip, which was attributed to the different fits of these noncustomized coupling solutions to the individual ear canal shape. Also, different types of individually customized earmolds and vent sizes are known to have large effects on the closedness of acoustic coupling assessed with REM (Denk et al., 2023). It is therefore highly desirable to reinvestigate the relation of closedness of acoustic coupling (measured with REM) to DIR + NR benefit in individual HA users.
The goal of this paper was to investigate the predictive value of four patient-specific factors and their combination for the speech-in-noise benefit that patients obtain from strong state-of-the-art commercially available DIR + NR in HAs. The factors investigated were the audiogram, the audible contrast threshold (ACT) value, age, and the individually measured closedness of the acoustic coupling. The test–retest reliability of each of these measures was assessed to give a realistic estimate of how much individual variance may be explainable given the accuracy of the audiological measures.
Methods
Participants
A total of 123 hearing-impaired individuals (52 female, 71 male) participated in this study. Eighty-two of the participants (26 female, 56 male) were native speakers of German who were recruited and tested at the University of Applied Sciences, Lübeck (Lübeck, Germany). The remaining 41 participants (26 female, 15 male) were native speakers of Japanese who were recruited and tested at OTO Clinic Tokyo (Tokyo, Japan). The average age was 65.2 years (
Pure-Tone Audiometry
All participants underwent otoscopy and standard clinical audiometry (in Germany: Affinity 2.0 audiometer, Interacoustics, Middelfart, Denmark; in Japan: AA-H1 or AA-M1A audiometer, Rion Co. Ltd, Tokyo, Japan) measuring air-conduction pure-tone thresholds at least at frequencies of 0.25, 0.5, 1, 2, 4, and 8 kHz. Figure 1 shows the pure-tone thresholds of all participants for the right and left ears, with hearing losses ranging from mild to severe. One hundred and nine participants showed across-ear threshold differences ≤15 dB for at least four out of the six measured audiometric frequencies, indicating rather symmetric hearing losses, whereas the remaining 14 participants exceeded that limit, indicating a higher degree of asymmetry. Binaural pure-tone averages over the thresholds at 0.5, 1, 2, and 4 kHz (BPTA4) and across ears were used as one predictor variable for the SRT benefit from DIR + NR.
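The two audiogram-derived quantities described above can be sketched in a few lines. This is a minimal illustration, assuming each ear's audiogram is stored as a dictionary mapping frequency in kHz to threshold in dB HL; the function names and the data layout are hypothetical and not taken from the study software.

```python
from statistics import mean

# Frequencies named in the text (kHz).
AUDIOMETRIC_FREQS_KHZ = (0.25, 0.5, 1, 2, 4, 8)
PTA4_FREQS_KHZ = (0.5, 1, 2, 4)


def bpta4(left, right):
    """Mean threshold (dB HL) over 0.5, 1, 2, and 4 kHz and over both ears."""
    return mean(ear[f] for ear in (left, right) for f in PTA4_FREQS_KHZ)


def is_rather_symmetric(left, right, max_diff_db=15, min_freqs=4):
    """True if the across-ear difference is <= max_diff_db at min_freqs or more
    of the six audiometric frequencies (the criterion quoted in the text)."""
    n_within = sum(
        abs(left[f] - right[f]) <= max_diff_db for f in AUDIOMETRIC_FREQS_KHZ
    )
    return n_within >= min_freqs


# Made-up example audiogram, for illustration only.
left_ear = {0.25: 20, 0.5: 30, 1: 40, 2: 50, 4: 60, 8: 70}
right_ear = {0.25: 25, 0.5: 35, 1: 40, 2: 55, 4: 80, 8: 90}
print(bpta4(left_ear, right_ear))                # 48.75
print(is_rather_symmetric(left_ear, right_ear))  # True
```

In the example, the ears differ by more than 15 dB only at 4 and 8 kHz, so four of the six frequencies satisfy the criterion.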

Pure-tone thresholds obtained for the right and left ears of the 123 individual participants (thin lines) along with the mean (bold line) and standard deviation (shaded area) across participants.
HA Fitting
All participants were provided with two receiver-in-the-canal Oticon More 1 HAs, which constituted the highest technological level of signal processing in the portfolio of the HA manufacturer Oticon at the time of testing, including advanced DIR + NR (Andersen et al., 2021). All participants were fitted with these HAs according to procedures typical for their home country:
The 82 German participants were fitted according to the NAL-NL2 gain rationale (Keidser et al., 2011) for “experienced” users by professionally trained and certified HA acousticians using Genie 2.0 fitting software (Oticon, Smørum, Denmark). Their acoustic coupling was chosen as prescribed by the Oticon Genie 2 fitting software, which resulted in a large variety of couplings across participants from open bass domes to customized earmolds with different vent sizes. Note that this choice of acoustic coupling creates a mix of individual patient differences (e.g., due to the audiogram) with intentional changes in the type of acoustic coupling within the German participants. Amplification was verified using REM implemented within Affinity 2.0 and adjusted accordingly by means of the REM AutoFit feature within Genie 2 using the international speech test signal (ISTS; Holube et al., 2010). The goal of this verification was that differences between targets and real-ear-insertion gains for input levels of 50, 65, and 80 dB were <5 dB.
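The verification goal stated above (target and real-ear insertion gain differing by less than 5 dB at input levels of 50, 65, and 80 dB) amounts to a simple tolerance check. The sketch below assumes gains are stored per input level as lists over the verification frequencies; this layout and the function name are hypothetical and do not describe the REM AutoFit implementation.

```python
def fit_verified(target_db, measured_db, tolerance_db=5.0):
    """Return True if every |target - measured| difference is below tolerance_db.

    Both arguments map an input level (dB) to a list of gains (dB),
    one value per verification frequency, in the same order.
    """
    for level, targets in target_db.items():
        measured = measured_db[level]
        if len(measured) != len(targets):
            raise ValueError(f"gain lists differ in length at {level} dB input")
        if any(abs(t - m) >= tolerance_db for t, m in zip(targets, measured)):
            return False
    return True


# Made-up gains at two frequencies per input level, for illustration only.
targets = {50: [20, 25], 65: [15, 20], 80: [10, 12]}
measured = {50: [22, 24], 65: [15, 23], 80: [6, 12]}
print(fit_verified(targets, measured))  # True
```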
The 41 Japanese participants underwent the Utsunomiya method of auditory rehabilitation for gain adjustment (Shinden et al., 2021; Yamada et al., 2020) and were fitted following the guidelines defined by the Japan Audiological Society (Kodera et al., 2016). Following local practice, they received nonvented customized earmolds, and electroacoustic analysis using the Rion AA-H1 or AA-M1A was used to assess amplification. Verification based on REM Autofit showed that the Utsunomiya method fits provided sufficient audibility, as the prescribed gains typically exceeded the gains prescribed by NAL-NL2 for frequencies between 0.5 and 4 kHz (Suzuki et al., 2023).
In both countries, no gradual gain adaptation over time was used. Two DIR + NR settings with identical gain settings were defined using Genie 2 and saved as two programs in the HAs, with participants blinded to the program assignment (single-blind): “DIR + NRoff” (DIR + NR algorithm inactive, HA directivity pattern in omnidirectional mode with pinna compensation only) and “DIR + NRstrong” (strongest adaptive DIR + NR setting available).
Closedness of Acoustic Coupling Measurements
Real-ear occluded insertion gain (REOIG) was measured for both ears of each participant as proposed by Cubick et al. (2022). Participants were seated in a double-walled sound-insulated booth facing the loudspeaker of the audiometer. The in-situ headset was attached to the ears of the participant, and in-situ tubes were inserted into the ear canal and positioned <6 mm from the eardrum, following the guidance from the REM software. First, real-ear unaided gains (REUGs) were determined using pink noise (in Germany) or the ISTS (in Japan) at 70 dB sound pressure level (SPL) using the Affinity Suite software and REM module. Then, the HA including acoustic coupling was carefully put in place, avoiding moving the in-situ tube. Real-ear occluded gains (REOGs) were then determined using the same stimulus and the exact same procedure as for the REUG, while the HA was switched off. REOIGs as a function of frequency, f, were then calculated as $\mathrm{REOIG}(f) = \mathrm{REOG}(f) - \mathrm{REUG}(f)$.

Average (black lines) and individual (gray lines) real-ear occluded insertion gain (REOIGs) for each ear of each participant sorted according to earpiece type.
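As a minimal sketch of the REOIG computation, an occluded insertion gain in the usual real-ear convention is the occluded gain minus the unaided gain at each frequency, both in dB. The weighted average below shows how a single closedness value could be derived from the curve. The weights in the example are placeholders, not SII band-importance values, and the function names are hypothetical.

```python
def reoig(reog_db, reug_db):
    """Per-frequency REOIG in dB: occluded gain minus unaided gain."""
    if len(reog_db) != len(reug_db):
        raise ValueError("REOG and REUG must share the same frequency grid")
    return [occluded - unaided for occluded, unaided in zip(reog_db, reug_db)]


def weighted_average(values_db, weights):
    """Weighted mean of per-frequency values; weights need not sum to 1."""
    if len(values_db) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values_db, weights)) / sum(weights)


# Made-up gains at three frequencies; placeholder weights (not SII values).
curve = reoig([10, 8, 2], [10, 12, 14])
print(curve)                               # [0, -4, -12]
print(weighted_average(curve, [1, 1, 2]))  # -7.0
```

Negative values indicate that the earpiece attenuates sound relative to the open ear, so a more negative average corresponds to a more closed coupling.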
Audible Contrast Threshold Test (ACT™)
The ACT™ was administered by the same trained HA acousticians who fitted the HA in Germany, and by medical technologists in Japan, as described in Zaar et al. (2024b) using customized software executed under Matlab (Mathworks, Natick, MA, USA) on a Windows-based PC. Stimuli were digital/analog converted using an RME (Haimhausen, Germany) Fireface UC soundcard, amplified using a Lake People (Konstanz, Germany) G103 headphone amplifier, and presented via RadioEar (Middelfart, Denmark) DD450 headphones. The participants’ responses were collected using a pushbutton fed through a customized converter box and recorded through an input channel of the RME Fireface UC sound card. Bursts of carrier noise (bandwidth: 354–2,000 Hz) with 1 s duration (including 125 ms raised-cosine fade in/outs) were played successively, while a modulation target was imposed on one of the bursts at a selected normalized contrast level (nCL) when the tester pressed the corresponding button in the test software. The participants were instructed to press the response button when they heard a “siren-like” sound within the “sound of the waves” according to the guidelines provided by Zaar et al. (2024b). The procedure included individual amplification of the stimuli based on the pure-tone audiogram such that the 1/3-octave-band levels in the stimulus frequency band were at least 15 dB above the hearing threshold with a maximum of 90 dB SPL; see Zaar et al. (2024b) for details. Participants were seated in a sound-insulated listening booth with headphones on and a pushbutton in hand. Starting with three target presentations at the highest contrast level of 16 dB nCL, an adaptive threshold tracking procedure similar to that used in pure-tone audiometry (Hughson & Westlake, 1944) was employed using a step size of 2 dB. The Hughson-Westlake tracking procedure employed a three-out-of-five stopping rule, counting among the last maximally five correctly detected targets from ascending sequences. 
If a test run was deemed inconsistent (typically when extending beyond 25 target presentations), the run was terminated and then repeated. Following a successful run, the ACT value was calculated based on all responses obtained in the run using a logistic function. For participants who were unable to do the task, one step above the highest contrast level of the procedure (i.e., 18 dB nCL) was stored as the result. For more details on the ACT™ procedure and data postprocessing, the reader is referred to Zaar et al. (2024b) and the research leading to it (Zaar et al., 2023a, 2024a). It is possible to categorize patients based on ACT value, audiogram data, and age (all language-independent tests) into groups with good, fair, and poor speech-in-noise ability, as done by Laugesen and Santurette (2023). However, this was not done in the present study, as the focus was to assess correlations of SRT benefit with each of these measures alone and in combination.
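The temporal shape of the carrier-noise bursts described above (1 s duration including 125 ms raised-cosine fade in/outs) can be sketched as an amplitude envelope. The sampling rate of 48 kHz is an assumption made for illustration, as the rate used for the ACT stimuli is not stated here, and the function name is hypothetical.

```python
import math


def burst_envelope(fs_hz=48000, duration_s=1.0, ramp_s=0.125):
    """Amplitude envelope with raised-cosine onset and offset ramps.

    fs_hz is an assumed sampling rate, not a value from the study.
    """
    n_total = round(fs_hz * duration_s)
    n_ramp = round(fs_hz * ramp_s)
    env = [1.0] * n_total
    for i in range(n_ramp):
        # Rises smoothly from 0 at the burst edge towards 1.
        w = 0.5 * (1.0 - math.cos(math.pi * i / n_ramp))
        env[i] = w                 # fade in
        env[n_total - 1 - i] = w   # mirrored fade out
    return env


env = burst_envelope()
print(len(env), env[0], env[6000])  # 48000 0.0 1.0
```

Multiplying a noise burst sample by sample with this envelope gives the gated stimulus. The modulation target itself is not covered by this sketch.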
After one familiarization run with the ACT, two runs were conducted in order to assess the test–retest reliability of the ACT™. Test–retest reliability was quantified using the root-mean-square error (RMSE) of two observations (Bland & Altman, 1996), that is,
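One common form of an RMSE for two observations per participant is the within-subject standard deviation of Bland and Altman (1996): the square root of the summed squared test–retest differences divided by twice the number of participants. The sketch below assumes this form. Whether it matches the exact expression used in the study is an assumption, and the function name is hypothetical.

```python
import math


def within_subject_rmse(test, retest):
    """Within-subject SD for paired measurements: sqrt(sum(d_i^2) / (2 * n))."""
    if len(test) != len(retest) or not test:
        raise ValueError("need two equally long, non-empty lists")
    sum_sq = sum((a - b) ** 2 for a, b in zip(test, retest))
    return math.sqrt(sum_sq / (2 * len(test)))


# Made-up ACT values in dB nCL for two participants, for illustration only.
print(within_subject_rmse([4.0, 6.0], [6.0, 4.0]))
```

With test–retest differences of 2 dB for every participant, this measure evaluates to the square root of 2, about 1.4 dB.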
Speech Intelligibility Test
SRTs were assessed using the everyday sentences material of the hearing in noise test (HINT) in German (Joiko et al., 2020) for the German-speaking participants, and in Japanese (Shiroma et al., 2008) for the Japanese-speaking participants, both spoken by male speakers. Three active Genelec (Iisalmi, Finland) 8020D loudspeakers were positioned in a quiet room with living-room-like acoustics (Germany) or a sound-isolated listening booth (Rion Co. Ltd, type AT-81) along a circle with a diameter of 2.5 m (Germany) or 2 m (Japan) at azimuth angles of 0° and ±100°. The participants were seated in a chair positioned in the center of the circle, facing 0° azimuth and their ears were at the same height as the loudspeakers. They were instructed to maintain a static head position and asked to verbally repeat the target-sentence words they had understood into a microphone. The responses were manually scored by trained HA acousticians (Germany) or medical technologists (Japan) who were native speakers of German and Japanese, respectively. The speech test was run using dedicated HINT software from Interacoustics A/S (Middelfart, Denmark) on a PC. All sounds were played through an RME Fireface UC soundcard at a sampling rate of 44.1 kHz. The talkback microphone was routed to the soundcard's headphone output such that the audiologists/medical technologists could listen to the responses via headphones. The target speech was presented from the front at 65 dB SPL(C) measured at the center of the circle. Running speech maskers, spoken by two different male talkers and mixed with low-level speech-shaped noise (−6 dB relative to the running speech level), were played via the two loudspeakers at ±100° (one competing talker per loudspeaker and each with independent speech-shaped noise). 
SRTs at the 50%-sentences-correct level were tracked by adjusting the masker levels (and thus the SNR) according to sentence-correct scoring (Nilsson et al., 1994), where the first sentence was presented at −4 dB SNR and repeated in increasing SNR steps of 4 dB until all words in the presented sentence were correctly identified. For the next two to four sentences, the SNR was increased/decreased by 4 dB after an incorrect/correct response, respectively. Then, an average across the SNRs used in the previous four presentations was calculated and applied to the next sentence; the SNR was adjusted from there in steps of 2 dB for the remaining sentences. The resulting data (% correct words at different SNRs) were analyzed using the method of Rønne et al. (2017) to obtain more accurate SRTs corresponding to 50% sentences correct. This was done only for the German HINT results, as the Japanese HINT test by Shiroma et al. (2008) is constructed only for sentence scoring. The normative SRTs obtained with speech-shaped stationary noise using headphone presentation in the conditions “noise left” and “noise right” are, on average, −13.7 dB SNR for the German HINT (Joiko et al., 2020, their Table 1) and −12.4 dB SNR for the Japanese HINT (Shiroma et al., 2008, their Table 1). To align the results from the two populations, the resulting difference of 1.3 dB in normative performance was subtracted from the SRTs collected for the Japanese population. Two training runs were conducted with DIR + NRoff using one HINT list (20 sentences) in the first run and two HINT lists (40 sentences) in the second run to familiarize the participant with the procedure. After that, SRTs were assessed with DIR + NRoff and DIR + NRstrong using two HINT lists each (40 sentences) in an order balanced across participants, who were blinded to the condition. SRT benefit from DIR + NR processing was calculated as $\text{SRT benefit} = \text{SRT}(\text{DIR + NRoff}) - \text{SRT}(\text{DIR + NRstrong})$, so that positive values indicate a lower (better) SRT with DIR + NRstrong.
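The alignment of the two language versions and the benefit measure reduce to simple arithmetic, sketched below. The normative values are the ones quoted in the text. The sign convention for the benefit (SRT without minus SRT with DIR + NR, so that positive values mean improvement) is an assumption consistent with the positive benefits reported, and the function names are hypothetical.

```python
# Normative SRTs quoted in the text (dB SNR).
GERMAN_NORM_SRT_DB = -13.7
JAPANESE_NORM_SRT_DB = -12.4


def align_japanese_srt(srt_db):
    """Subtract the normative difference (about 1.3 dB) from a Japanese SRT."""
    offset_db = JAPANESE_NORM_SRT_DB - GERMAN_NORM_SRT_DB
    return srt_db - offset_db


def srt_benefit(srt_off_db, srt_strong_db):
    """Benefit in dB; positive when the SRT is lower (better) with DIR + NRstrong."""
    return srt_off_db - srt_strong_db


# Made-up SRTs for one Japanese participant, for illustration only.
off = align_japanese_srt(-2.0)
strong = align_japanese_srt(-6.0)
print(round(off, 1), round(strong, 1), round(srt_benefit(off, strong), 1))
```

Because the same offset is subtracted from both conditions, the alignment changes the absolute SRTs but leaves the within-participant benefit unchanged.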
Study Protocol
The research reported here was part of a larger study that tested several different DIR + NR settings and amplification settings for benefit and preference in the laboratory and in the field. This paper reports solely on laboratory measurements taken during two visits that were at least 6 months apart. The general sequence of the study was that pure-tone audiometry was conducted in the first visit. If instant ear tips were going to be used, as was the case with most (52) of the German participants, the HA fitting took place within the same session. If customized earmolds needed to be manufactured, as with all Japanese participants and with 30 German participants, ear impressions were taken and participants returned to the laboratory for a second visit to receive the HA fitting, once the customized earmolds were manufactured. An accommodation period of at least 2 weeks with the HA with two programs DIR + NRoff and DIR + NRstrong followed, to accustom participants to the new HAs and amplification. Participants were encouraged to sometimes switch between programs during the accommodation period. The ACT (test and retest) and HINT measurements were conducted during the visit after the accommodation period. Participants then went into the field phase, that is, six field periods of at least 4 weeks each, where different DIR + NR settings and different amplification settings were used (two programs per period, single-blinded to the participants) and participants indicated their overall and situation-specific preference. In the last visit, the ACT (test and retest) was repeated, and HINT measurements were repeated, which was possible with 115 participants, since eight participants dropped out of the study for personal reasons. REOIG measurements were done for 112 participants.
Results
Closedness of Acoustic Coupling Measurements
Figure 2 shows REOIG as a function of frequency for each ear of each participant sorted according to earpiece type (gray lines) and average REOIG for each earpiece type (black lines). Most REOIGs were flat around 0 dB for frequencies up to about 500 Hz and showed different amounts of attenuation at higher frequencies. Open bass domes showed the smallest amount of attenuation, typically <10 dB, and also relatively small variation across participants. The three dome types (top row of Figure 2) showed maximal attenuation between 2 and 3 kHz. Venting affected the average attenuation, which can be observed by comparing “grip tip 2.4 mm vent” with “grip tip no vent,” and “power mold with vent” with “power mold no vent.” While this held for the average REOIG for each earpiece type, there was extensive variability across participants, especially within the frequency region 1–4 kHz. The lack of attenuation at low frequencies is discussed later.
ACT Values
Figure 3 shows all ACT values obtained for test and retest measurements before and after the field phase. Dashed lines connect the ACT values of the same participant. Most of the dashed lines were roughly horizontal, indicating the stability of the ACT values. This was confirmed by the intraindividual variability measure RMSE (equation (2)): the RMSE was 1.2 dB for both the prefield and the postfield session. For average prefield versus average postfield ACT, the RMSE was 1.4 dB. In a clinically relevant scenario, only a single ACT value per session would be obtained. Comparing only the first measurement of the prefield session with the first measurement of the postfield session, the RMSE was 1.8 dB.

Audible contrast threshold (ACT) values obtained for individual participants for testing and retesting both prefield and postfield (for each participant connected using dashed lines). Boxes range from the 25th to 75th percentile, the median is denoted as a horizontal line, and whiskers denote ranges of 1.5 times the box length.
Speech-in-Noise Performance and DIR+NR Benefit
Figure 4a shows boxplots of single-participant SRTs for the unaided condition and with HA with DIR + NRoff and DIR + NRstrong, both before and after the field phase. In addition, the normal-hearing 95% confidence range (average ± two times the standard deviation) for the exact same setup, noises, and speech material as the German participants experienced is plotted as a gray area (data from Steffen, 2022). Note that SRTs could not always be measured in the unaided condition, because for 23 participants unaided speech intelligibility in quiet was so poor that the adaptive procedure of the HINT did not converge. Therefore, the boxplot of SRTs for the unaided condition indicates better quantiles (only of those for whom SRTs were measurable). SRTs were not normally distributed according to the Shapiro–Wilk test, so a nonparametric Wilcoxon rank sum test was used for statistical comparison. There was a significant improvement in SRT from unaided to DIR + NRoff (difference of medians: 0.7 dB,

(a) Boxplots and single-participant data (German: circles; Japanese: triangles) of SRTs in noise for all participants for whom SRTs were measurable in the unaided condition, and for all participants aided with HA in DIR + NRoff and DIR + NRstrong conditions, which are shown before and after the field phase, (b) SRT benefit before and after the field phase.
Contributions of Audiological Variables to SRT Benefit
Correlation analyses using Pearson's correlation coefficient

Scatter plot showing individual SRT benefit (averaged across prefield and postfield sessions) versus individual average SII-weighted REOIG for all participants. German participants’ pseudonyms start with “G,” and Japanese participants’ with “J.”
This indicates that even participants with completely open acoustic coupling (average SII-weighted REOIG near 0 dB) received, on average, an SRT benefit from DIR + NR of 3.2 dB in this acoustic scenario. Furthermore, for each additional 1 dB of closedness of the acoustic coupling, 0.24 dB more SRT benefit was obtained, on average. Figure 5 provides another perspective on REOIGs for different acoustic coupling types. While most open couplings (open bass dome, red circles) cluster tightly on the right-hand side, there is a large spread of avREOIGs within other acoustic coupling types. For example, the “Grip tip no vent” avREOIG (blue circles) ranges from −2 dB (almost open) to beyond −9 dB (modestly closed).
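The two numbers quoted above define a straight line that can be evaluated for any avREOIG. The sketch below uses the intercept of 3.2 dB and the slope of 0.24 dB per dB from the text and assumes that more negative avREOIG values mean a more closed coupling, as the quoted range from −2 dB (almost open) to −9 dB suggests. The function name is hypothetical. The line describes a group-average trend and not a prediction for an individual ear.

```python
# Group-level regression values quoted in the text.
INTERCEPT_DB = 3.2        # average benefit for fully open coupling (avREOIG near 0 dB)
SLOPE_DB_PER_DB = 0.24    # additional benefit per 1 dB of closedness


def average_srt_benefit(av_reoig_db):
    """Group-average SRT benefit (dB) expected for a given avREOIG (dB)."""
    closedness_db = -av_reoig_db  # assumed convention: more negative = more closed
    return INTERCEPT_DB + SLOPE_DB_PER_DB * closedness_db


for av in (0.0, -2.0, -9.0):
    print(av, round(average_srt_benefit(av), 2))
```

For the "Grip tip no vent" range mentioned above, the line gives about 3.7 dB at −2 dB and about 5.4 dB at −9 dB, a difference of roughly 1.7 dB attributable to individual fit alone under these assumptions.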
The second-strongest single-value correlation with SRT benefit was for the ACT values (
A generalized linear regression model with output SRT benefit modeled just from avREOIG (i.e., SRT benefit ∼ 1+ avREOIG) showed both intercept (

Results of regression analyses with SRT benefit as the outcome variable. The colored bars show the variance in SRT benefit explained individually and jointly by the factors avREOIG, ACT, BPTA4, and age.
Since part of the variance of the SRT benefit originates from the limited test–retest reliability of the SRT measurements, it is instructive to assess the maximal possible explainable variance, given the test–retest accuracy. This was done using bootstrapping with 1 million datasets of the same randomness (normal distribution) as the SRT benefit dataset with RMSE = 1.3 dB. This resulted in an average
To get a more realistic estimate of the maximum SRT benefit variance explained by the most prominent factor avREOIG, another bootstrapping was done under the assumption that the correlation between avREOIG and SRT benefit was perfect (i.e.,
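The idea behind such a ceiling estimate can be illustrated with a small Monte Carlo simulation: if a predictor captured the true benefit perfectly, the variance it explains in the measured benefit would still be limited by measurement noise. The sketch below is not the authors' bootstrapping procedure. It treats the quoted RMSE of 1.3 dB as the standard deviation of additive Gaussian noise, uses a hypothetical true-benefit standard deviation of 2 dB and a hypothetical sample size, and all function names are illustrative.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_explained_variance(true_sd_db, noise_sd_db, n_participants,
                            n_datasets, seed=0):
    """Average r^2 between a perfect predictor and noisy measurements."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_datasets):
        true = [rng.gauss(0.0, true_sd_db) for _ in range(n_participants)]
        measured = [t + rng.gauss(0.0, noise_sd_db) for t in true]
        total += pearson_r(true, measured) ** 2
    return total / n_datasets


# Hypothetical inputs: true SD 2 dB, noise SD 1.3 dB, 115 participants.
print(mean_explained_variance(2.0, 1.3, n_participants=115, n_datasets=200))
```

Under these assumptions the result lies close to the analytical value 2² / (2² + 1.3²) ≈ 0.70, which shows how test–retest noise alone caps the explainable variance below 100%.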
Discussion
The present study investigated the correlation of four audiological patient-specific factors (and their combination) with the SRT benefit that an individual obtains from strong DIR + NR processing, as compared to no DIR + NR processing, in a state-of-the-art HA. The strongest correlation was with the closedness of acoustic coupling specified by the average SII-weighted REOIG, followed by the ACT value and the audiogram (in the form of BPTA4). Age was not correlated with SRT benefit and did not increase the percentage of SRT benefit variation explained when used in combination with any of the other factors.
Closedness of Acoustic Coupling
The individual (frequency-specific) REOIGs of the instant ear tips in the present study are comparable to those measured by Blau et al. (2008) and Cubick et al. (2022), both on average and in terms of individual variability, despite the different brands used (Phonak and WSA in their studies, respectively, vs. Oticon here). The large individual variability of REOIG is at least partly also due to test–retest reliability. The Bland-Altman data of Cubick et al. (2022) indicate, however, highly consistent REOIGs across six repetitions. The derived avREOIG is therefore most likely also highly consistent across repetitions, at least for the instant ear tips that Cubick et al. (2022) used in their study. Furthermore, the REOIGs of individually manufactured earmolds qualitatively agree with the published data of Blau et al. (2008) and Denk et al. (2023). A quantitative difference exists when comparing REOIGs of closed earmolds without vent (Figure 2, bottom right panel) at low frequencies, that is, below 250 Hz. Here, Blau et al. (2008) and Denk et al. (2023) show attenuation values in their REOIGs of about 10 dB for the most closed variant, whereas the average value at 250 Hz in the present study is close to 0 dB. The reason for these differences is most likely methodological. Denk et al. (2023) used an additional bore within their earmolds for the REM probe tube microphone, which exactly fitted the diameter of the probe tube, whereas the present study placed the probe tube between the ear canal wall and the earmold, potentially creating a small slit leak that acts as a vent. The difference may not be negligible for the power molds without vent, and the avREOIG values for these most closed acoustic couplings may therefore underestimate their actual closedness.
Cubick et al. (2022) showed that REOIGs are stable over time and that measured vent effects and occlusion ratings are somewhat related (with more open venting leading, on average, to less perceived occlusion), but this was also found to be highly variable across participants. The benefits of more closed acoustic coupling are lower feedback thresholds (Blau et al., 2008; Winkler et al., 2016), higher aided gains at low frequencies (Ueno et al., 2021), higher SNRs with DIR (Bentler et al., 2006), and increased speech-in-noise benefit from DIR + NR (shown here). Since these benefits go alongside potential disadvantages, such as a substantially higher perceived occlusion (Denk et al., 2023) and dull perception of the user's own voice, the optimal tradeoff between too-closed and too-open may differ across individuals. The data of the present study suggest prescribing the minimally acceptable amount of venting if the goal is to maximize SRT benefit. Such a minimally acceptable amount of venting may be found for example by incorporating tympanometry results based on the data of Carle et al. (2002), who found a strong correlation between middle ear compliance and minimum acceptable equivalent vent size.
Presumably, one big contributor to the high correlation of SRT benefit with avREOIG is the intentional choice of different acoustic coupling types across the German participants based on the fitting software recommendation. While choosing different acoustic coupling types (based on the audiologist's recommendation in agreement with the preference of the HA user) is the practice in many countries, this creates a confounding of intentional differences in closedness of acoustic coupling and individual patient differences for the present study, which needs to be acknowledged here.
ACT Values
The ACT was found to be a highly reproducible, easy-to-conduct, and fast audiometric test, and the ACT value was the second-strongest factor correlated with the SRT benefit. However, in comparison to Zaar et al. (2024a), the amount of SRT benefit variation explained by the ACT value alone (10.8%) is much smaller than the variation of SRT benefit explained in their study by the ACT's predecessor STMD test (51%). Reasons for this discrepancy may be (a) the difference in DIR + NR processing compared to the former study (which had improved substantially in the HAs tested in the present study), (b) the difference between the clinical ACT™ and the STMD test, and (c) the use of only 30 patients in Zaar et al. (2024a) compared to more than 120 here. Another possible reason for the difference in explained variance may be a larger variety of acoustic couplings in the present study, including 41 individually fitted earmolds without vents (vs. only software-prescribed domes or earmolds in Zaar et al., 2024a). Less likely reasons may be different amplification settings and different native languages of the participants in their study.
Country-specific analyses were done in order to narrow down the reasons for the discrepancies. The Japanese participants all had the same type of earpiece (earmolds without vent), and the same amplification strategy, and participant numbers were close to those of Zaar et al. (2024a). An analysis of data for the Japanese participants resulted in a correlation coefficient of
Since ACT values are a better predictor of aided performance without DIR + NR (Zaar et al., 2023b) than the audiogram, and ACT values added to the explained SRT benefit variance using avREOIG similarly to the factor BPTA4, the following interpretation may be valid: For predicting the SRT benefit, that is, the SRT difference between unprocessed and DIR + NR processed conditions, the ACT value may not be needed if the audiogram was already measured. However, for the prediction of absolute SRT(DIR + NRoff) and SRT(DIR + NRstrong) the ACT value is needed. Therefore, the use of ACT values for a language-independent prediction of the SRT with and without DIR + NR can be recommended based on the data of the present study. The reasons why both ACT values and standard audiometry have the same predictive value for the SRT benefit are currently not clear. With the standard parameters of the ACT stimulus, that is, a bandwidth of 354 to 2000 Hz, a spectral modulation of two cycles per octave, and a temporal modulation of 4 Hz, it is likely that temporal fine structure processing is reflected in the ACT value rather than spectral resolution (cf. Mehraei et al., 2014).
Other Factors
BPTA4 was a weaker (but still significant) predictor of the SRT benefit than avREOIG. Instead of the BPTA4, using the better-ear PTA4 for each participant (based upon the motivation that better ear listening may be responsible for the SRT benefit, cf. Williges et al., 2019), resulted in only minor differences in the correlation with SRT benefit (
The results for the subgroup of Japanese participants can be used to at least partly separate the intentional variations in acoustic coupling from “real” patient-specific differences, because all Japanese participants had the same type of earpiece. Limiting correlation analysis to the Japanese participants showed no significant correlation between SRT benefit and avREOIG (
Ricketts and Mueller (2000) found a significant correlation between the benefit of directional HA processing and aided omnidirectional performance. The corresponding correlation was also highly significant for the present data (
Another audiological factor that was not tested here, but that may be related to SRT benefit, is the individual ability to process speech binaurally, which is measurable with the binaural intelligibility level difference (BILD or binaural squelch). Neher et al. (2017) found that the BILD is an indicator of how much a person benefits from strong bilateral beamforming, that is, DIR processing exploiting the signals from both left and right HA. This appears plausible, as people with strong internal binaural NR processing (as described by the model of Beutelmann & Brand, 2006) may be hampered by DIR processing, which may reduce the availability of binaural cues. These people may thus benefit less from DIR + NR than people whose internal NR is suboptimal, such as in cases of asymmetric hearing loss (Neher et al., 2017). However, since the vast majority of the participants of the present study had fairly symmetric hearing losses, this was deemed not to be a strong factor here. Cognitive factors, such as short-term memory (e.g., assessed using the reading span test, as in Neher, 2014), were found to be related to SRTs with DIR + NR processing. Another cognitive measure, the reverse digit span score, was related to SRTs in NRoff (Zaar et al., 2024a). However, cognitive factors were not assessed in the present study. Therefore, cognitive factors should be assessed in future studies to try to explain the remaining variance in SRT benefit.
SRT Benefit
With a median of 4.7 dB and a maximum of 10.8 dB, the SRT benefit was considerable in the present study. These large benefits are most likely dominated by the DIR processing rather than the NR processing, since the spatial acoustic scenario used here provides a particular advantage for DIR processing. A small contribution of the (neural network-based) NR processing to the SRT benefit can, however, be expected, since the same NR algorithm without DIR processing showed significant improvements in speech intelligibility (Andersen et al., 2021). Quantification of the relative contributions of DIR and NR was not a goal of the present study; a separate adjustment of DIR and NR is also not intended within the software of the manufacturer. It is notable that some hearing-impaired participants of the present study showed lower (better) SRTs with DIR + NRstrong than the reference SRT range obtained for young normal-hearing listeners.
Practical Consequences for HA Fitting
Predicting the speech intelligibility benefit that a patient would achieve from a specific DIR + NR strength at the first fit, or even before, would be a major advantage for audiologists in selecting the most appropriate DIR + NR strength and in convincing patients to use a specific type of acoustic coupling. Note that “most appropriate” may have several different aspects. In the context of this manuscript, we only investigated how different audiological factors are correlated with SRT benefit from the strongest DIR + NR. Correlations of SRT benefit with other DIR + NR settings or the inclusion of other patient-specific criteria, such as personal preference (situation-specific or overall), spatial perception, or sound quality, may also be important in finding the most appropriate setting, but are not considered in the present study. Therefore, the results of the present study can only be one out of many steps toward this goal. The results indicate that the SRT benefit in a realistic speech-in-noise setting with strong DIR + NR strongly depends on the closedness of the acoustic coupling, but also on the patient's hearing ability (quantified either using the ACT™ or the audiogram). The best way to predict the SRT benefit seems to be to measure either of these two and the REOIG in the individual patient, because the REOIG can vary considerably across patients, even for the same earpiece (Cubick et al., 2022), as our data confirmed. If REM verification, which is the current best practice in many countries for HA fitting, is done, the REOIG can be measured within the same process, thus not adding substantially to the time needed, because the REOIG requires only one additional measurement, the REOG. The REUG is measured during REM verification. Compared to the real-ear-aided gain, no additional placement of in-situ tubes or change in the position of the HA is needed for the REOG, as the HA can simply be turned off to measure the REOG.
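As a minimal sketch of how the REOIG follows from the two real-ear measurements named above, the snippet below assumes the common definition of an insertion gain as a difference in dB, that is, REOIG = REOG − REUG per frequency. The function names, the example frequencies, and all gain values are hypothetical, and averaging over all supplied frequencies is an assumption that may differ from the frequency range used for avREOIG in the present study.

```python
import numpy as np


def reoig(reog_db, reug_db):
    """Per-frequency REOIG in dB, assuming REOIG = REOG - REUG."""
    return np.asarray(reog_db, dtype=float) - np.asarray(reug_db, dtype=float)


def average_reoig(reog_db, reug_db):
    """Average REOIG across all supplied frequencies (assumed averaging rule)."""
    return float(reoig(reog_db, reug_db).mean())


# Made-up values for illustration only (not data from the study).
freqs_hz = [250, 500, 1000, 2000, 4000]
reug_db = [1.0, 2.0, 3.0, 12.0, 14.0]   # unaided gain, from REM verification
reog_db = [-1.0, -2.0, -5.0, 2.0, 4.0]  # occluded gain, HA in place but off

av_reoig = average_reoig(reog_db, reug_db)
```

Because the REUG is already available from REM verification, only the REOG values would need to be added to obtain this quantity.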
Even before an earpiece is manufactured (for individual earmolds) or put into place (for instant fit ear tips), an estimate of the potential SRT benefit attainable with it is possible based on the average closedness of each earpiece. If the individual avREOIG is replaced by the average REOIG of each earpiece, the SRT benefit variance explained by the combined REOIG + ACT linear model drops from 44.1% to 39.9%. This still indicates a fair predictive value. HA manufacturers could implement such a prediction in their fitting software. Such a prediction would get even more precise if more data (ACT values and/or individually measured REOIG) were available.
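The explained-variance comparison described above can be illustrated with an ordinary least-squares fit. The following sketch uses synthetic data only: the sample size, coefficients, noise level, and variable names are arbitrary assumptions and do not reproduce the study data or the reported 44.1% and 39.9%. It shows how the explained variance (R²) of a single-predictor model and a combined two-predictor model could be compared.

```python
import numpy as np


def explained_variance(predictors, y):
    """R^2 of an ordinary least-squares fit with intercept."""
    y = np.asarray(y, dtype=float)
    # Design matrix: intercept column plus one column per predictor.
    X = np.column_stack([np.ones(len(y)), predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


# Synthetic, hypothetical data (not from the study).
rng = np.random.default_rng(0)
n = 120
av_reoig = rng.normal(-6.0, 4.0, n)   # hypothetical avREOIG values in dB
act = rng.normal(6.0, 3.0, n)         # hypothetical ACT values
benefit = 4.7 - 0.3 * av_reoig - 0.2 * act + rng.normal(0.0, 1.5, n)

r2_reoig = explained_variance(av_reoig, benefit)
r2_combined = explained_variance(np.column_stack([av_reoig, act]), benefit)
```

Replacing the individual `av_reoig` values by a per-earpiece average before fitting would correspond to the earpiece-based prediction discussed above. For nested least-squares models, the combined R² can never be lower than the single-predictor R².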
The actual speech intelligibility benefit achieved in everyday life is surely sound environment-dependent (see Bentler, 2005 for a review), such that the predictions of the present study will not be generally applicable to all acoustic scenes. However, spatial scenes with competing noise sources from different directions are fairly common for HA users, such that the results of the present study indicate at least one ecologically valid acoustic scenario. Still, research is needed to predict the benefit of DIR + NR in other complex or simple acoustic scenarios.
Conclusions
Four audiological factors were correlated (both in isolation and combined) with individual SRT benefits from DIR in combination with spectral NR (DIR + NR) in a state-of-the-art HA when a large variety of acoustic coupling types was used. The following conclusions can be drawn:
Closedness of acoustic coupling, as measured using avREOIGs, had the strongest correlation with SRT benefit. An as-closed-as-acceptable coupling for fitting HA is recommended to maximize the SRT benefit.
Spectro-temporal modulation sensitivity (or audible contrast), as measured using the ACT™, was the second-strongest predictor, having the same correlation with SRT benefit as the audiogram. This indicates that the SRT benefit is as much related to suprathreshold hearing abilities as to loss of audibility.
Age was not significantly correlated with SRT benefit.
However, speech-in-noise performance in the unaided condition (without HA) was correlated with SRT benefit, indicating that HA users with poor SRTs benefitted most from DIR + NR.
