
Self-determined hearing through artificial intelligence (AI)

Peggy Sylopp 1

sincEARe UG Strelitzer Str. 60, 10115 Berlin, Germany

Tobias Bruns 2

Fraunhofer Institute for Digital Media Technology (IDMT), Oldenburg Branch for Hearing, Speech and Audio Technology (HSA) Marie-Curie-Straße 2, 26129 Oldenburg, Germany

ABSTRACT Hearing disorders are widespread in industrialized countries. In Germany, with approximately 14 million cases, hearing loss is one of the most common health conditions. 75% of those affected do not use hearing aids and accept possible personal consequences such as unemployment, depression and dementia, which also have larger economic implications. Studies have shown that individualized sound adjustment can result in better quality of life. In the rapidly growing market for hearables (devices like headphones, headsets and hearing aids), sound personalization is an emerging trend. The most recent high-priced developments in the field of hearing aids integrate AI-based self-adjustment of sound. In this article, we introduce new approaches to integrating AI into hearing aids for more advanced self-determined hearing. We examine the benefits and limitations of various applications of AI in hearing aids today, and discuss further developments of AI in hearing care that might become available in the future.


Keywords: hearing aid; artificial intelligence; machine learning; deep neural networks; self-adjustment; speech intelligibility; sound balance; product comparison

1. INTRODUCTION

Artificial intelligence (AI) nowadays mostly refers to some form of deep learning running on artificial neural networks that mimic neural connections in the brain. This paper gives an overview of new approaches to integrating AI in hearing aids for more self-determined hearing. In order to achieve a more user-driven approach to individualized sound in hearing aids, machine learning, a subfield of artificial intelligence, has been employed. Hearing aids process data received from users and acoustic environments and perform complex tasks autonomously or adaptively, learning from accrued experience to improve results. In this paper, we present and discuss current AI applications that are already integrated into commercially available hearing aids, which we consider to be particularly effective for the enrichment of everyday use. These advancements in AI can be a turning point for the future of hearing aids, and we discuss the further potentials thereof.

The fitting of hearing aid amplification algorithms is based on scientific evaluation in laboratory measurements and on the audiogram, and addresses the average user and listening environment by incorporating speech intelligibility and loudness perception models, as in the NAL-NL2 procedure (Keidser et al., 2011). However, a growing body of hearing research studies shows that hearing

1 sylopp@pexlab.space
2 tobias.bruns@idmt.fraunhofer.de


preferences vary significantly between people, even if they have the same hearing ability (Nelson et al., 2018; Johansen et al., 2018). The prescriptive fitting rules for average users in an average listening environment can only be seen as a starting point for further fine-tuning to the user's subjective needs. Fitting formulae can be individualized to some extent by following parameters like gender, hearing aid experience or age. General settings, such as preferred gain levels as well as loudness and discomfort levels, vary substantially between individuals. Moreover, the preferred settings of the same user vary across specific listening situations, and, importantly, the intention of the user plays a key role. For example, whether the user intends to actively listen to a concert or passively listen to music at a cafe accounts for differences in their preferences for sound modulation.

The common practice amongst hearing acousticians is to iteratively fine-tune the hearing aid's fit in successive appointments with individual patients. However, this process is time-consuming. Patients can have difficulties describing their perceptions, and audiologists have difficulties interpreting the users' descriptions. To overcome this barrier, methods of self-fitting by the hearing aid user have been implemented and evaluated (Nelson et al., 2018; Chalupper et al., 2009; Dreschler et al., 2008; Gößwein et al., 2022). The result of the self-fitting may differ depending on whether it is set for optimal speech understanding, for pleasantness (Gößwein et al., 2022) or for speech in noise. In order to handle different environments, hearing aid wearers could store different presets, but this would still require selecting preferences manually. To automate these different settings, machine learning comes into play.

To better adapt hearing aids to individuals' needs in any given situation, research is being done to integrate machine learning, either for automatically selecting presets according to the current environmental acoustic input, or for optimizing a general sound setting to better fit all environmental situations. Studies on trainable hearing aids investigate how hearing aid parameters like loudness, frequency weighting and compression curves could be self-fitted by the user with machine learning tools (Dillon et al., 2006; Zakis et al., 2007; Chalupper et al., 2009; Convery et al., 2011). One commercial product that resulted from such research is, for example, Siemens's SoundLearn technology. In recent years, artificial intelligence became widely known through emerging technologies like speech recognition. Advancements in chip technology allow such learning methods to be performed on smartphones and even integrated into hearing aids. The current high-priced developments in the field of hearing aids integrate AI-based self-adjustment of sound.
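As a minimal illustration of the trainable-preset idea described above, the following Python sketch learns a user's preferred per-band gains for each labeled listening situation from logged manual adjustments. It is our own simplification, not any manufacturer's algorithm; the environment labels, the three-band gain layout and the dB values are assumptions.

```python
# Hedged sketch of a trainable preset: learn a user's preferred per-band gains
# for each labeled listening situation from logged manual adjustments.
# Labels, band layout and dB values are illustrative assumptions.
from collections import defaultdict
import numpy as np

class TrainablePresets:
    def __init__(self, n_bands=3):
        self.logs = defaultdict(list)   # environment label -> list of gain vectors (dB)
        self.n_bands = n_bands

    def log_adjustment(self, environment, gains_db):
        """Store the gains the user chose manually in this environment."""
        self.logs[environment].append(np.asarray(gains_db, dtype=float))

    def preset_for(self, environment):
        """Return the learned preset (median of logged gains), or a flat default."""
        entries = self.logs.get(environment)
        if not entries:
            return np.zeros(self.n_bands)
        return np.median(np.vstack(entries), axis=0)

presets = TrainablePresets()
presets.log_adjustment("cafe", [2, 0, 4])
presets.log_adjustment("cafe", [3, 1, 5])
presets.log_adjustment("concert", [-2, 0, 1])
print(presets.preset_for("cafe"))   # -> [2.5 0.5 4.5]
```

In practice, such training happens inside the hearing aid firmware and covers more parameters (e.g., compression), but the basic idea of mapping logged user adjustments to per-situation settings is the same.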


2. TECHNOLOGY OVERVIEW

Some hearing aid manufacturers such as Signia, Widex, Oticon, Starkey and Phonak (see Table 1) have already introduced a form of artificial intelligence in their hearing aids, today often based on deep neural networks (DNNs). Quite common is the use of machine learning for selecting predefined hearing settings based on acoustic environment classification (AEC). AEC algorithms are trained on a vast amount of sound recordings, and analyze and classify the user's current acoustic environment. Starkey uses the AEC information to control predefined settings for noise reduction, directionality, and gain (Fabry & Bhowmik, 2021); this is incorporated in Starkey's "Edge Mode". In addition, a smartphone microphone or an external microphone array can be used for optimized voice enhancement, and these devices also provide processing power for edge computing. This additional processing power might look like an advantage, but has the downside that users always need to carry an edge device, the microphones always need to be placed manually, and there is added delay due to the Bluetooth connection. Due to these limitations, this solution is only viable for users with a pure-tone average hearing loss greater than 50 dB HL (Fabry & Bhowmik, 2021; Cook, 2020). Phonak offers a DNN-based AEC within the hearing aid that allows


it to seamlessly blend between environments with different sound settings and that also uses the hearing aid's motion sensor to distinguish between listening situations (e.g., Sport, Pub). The following table gives an overview of hearing aid manufacturers that offer AI or machine learning capabilities:

Table 1: Technology overview of AI integration in current hearing aids.

Signia — Signia Assistant (2020)
  User adjustment / labeling method: recommendation system
  Learning paradigm: DNN / SOM (?)
  Goal: individualized sound
  Cons: less control for the user
  Pros: quick results, based on experience

Widex — Sound Sense Learn (2018)
  User adjustment / labeling method: A/B comparison and degree of preference
  Learning paradigm: Bayesian optimization
  Goal: individualized sound
  Cons: higher adjustment effort
  Pros: precise settings based on audio preferences

Oticon — MoreSound Intelligence (2021)
  User adjustment / labeling method: ratings of processed 3D sound scenes
  Learning paradigm: recurrent network / LSTM
  Goal: sound enhancement
  Cons: no individual sound
  Pros: clear and distinct sound

Starkey — IntelliVoice (2020)
  User adjustment / labeling method: analysis of the individual user's sound environments (Edge Mode) / labeled environments
  Learning paradigm: DNN-based feature classification of the individual user's environments
  Goal: AEC for controlling noise reduction, directionality and gain settings
  Cons: delay of smartphone processing, only usable for higher hearing loss
  Pros: edge computing on external microphones (smartphone)

Phonak — AutoSense OS / Speech Enhance (2021)
  User adjustment / labeling method: labeled environments and motion sensor data
  Learning paradigm: DNN-based feature classification
  Goal: sound optimization by blending between environment settings
  Cons: no AI-based sound optimization
  Pros: seamless blending between settings in mixed environments
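To make the AEC concept summarized in Table 1 more concrete, the sketch below trains a small neural-network classifier on synthetic audio features and uses the predicted environment to select a predefined setting. The feature set, labels and preset values are placeholders for illustration only; commercial AECs are DNNs trained on large databases of real recordings, as described above.

```python
# Hedged sketch of AEC-driven preset selection: a small neural-network
# classifier predicts the listening environment from audio features, and the
# prediction selects a predefined hearing-aid setting. Features, labels and
# settings are synthetic placeholders, not a manufacturer's trained DNN.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
LABELS = ["quiet", "speech_in_noise", "music"]

# Synthetic training data: e.g. [level_dB, spectral_centroid_kHz, modulation_index]
X = np.vstack([
    rng.normal([45, 1.0, 0.2], [5, 0.3, 0.05], size=(100, 3)),   # quiet
    rng.normal([70, 2.0, 0.6], [5, 0.3, 0.05], size=(100, 3)),   # speech in noise
    rng.normal([75, 3.0, 0.3], [5, 0.3, 0.05], size=(100, 3)),   # music
])
y = np.repeat(LABELS, 100)

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0).fit(X, y)

PRESETS = {                      # predefined settings controlled by the AEC output
    "quiet":           {"noise_reduction": 0, "directionality": "omni", "gain_db": 0},
    "speech_in_noise": {"noise_reduction": 2, "directionality": "beam", "gain_db": 3},
    "music":           {"noise_reduction": 0, "directionality": "omni", "gain_db": 1},
}

env = clf.predict([[68, 2.1, 0.55]])[0]   # features of the current acoustic scene
print(env, PRESETS[env])
```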

Other technologies exist for sound enhancement that try to replace traditional hearing aid algorithms such as beamforming and noise reduction. These traditional algorithms are based on simplistic situational and acoustical assumptions, such as speech received from a frontal direction with static environmental background noise. The manufacturer Oticon has developed a data-driven approach: a DNN trained on daily-life listening environments that has learned to distinguish which acoustical information belongs to the foreground and which belongs to the background and can therefore be reduced (Santurette & Behrens, 2020; Brændgaard & Loong, 2020). The sound scene is then


cleaned up in a way that helps users make better sense of their environment, which is often perceived as blurred. Together with spatial balancing of the analyzed environment while maintaining binaural cues, this approach offers great potential for solving remaining issues in the perception of sound environments that arise with conventional sound processing in hearing aids. We consider this approach trend-setting for a more user-oriented development of hearing aid algorithms and review the comprehensive publications of Santurette & Behrens (2020) and Brændgaard & Loong (2020) in chapter 3. The final step towards self-determined hearing is, of course, the training of individualized hearing aid settings based on the user's input. Such trainable hearing aids have been around for some time (Chalupper et al., 2009), but recent developments in AI technology take this training to another level. Employees of Widex and WS Audiology published an article (Balling et al., 2021) describing an AI-driven self-adjustment method via A-B comparisons. We consider this article especially valuable because it contains comprehensive data and analysis that provide good insight into the user-driven application of AI in everyday life. For this reason, we place a special focus on the review of this publication in chapter 4.


3. AI FOR SOUND ENHANCEMENT OF REAL ENVIRONMENTS

In this chapter, we discuss the MoreSound Intelligence (MSI) feature of the Oticon More™ technology, as documented in the publications by Santurette & Behrens (2020) and Brændgaard & Loong (2020). The MSI approach is to open hearing aid technology up to all meaningful sounds, not only speech-like sounds, in order to create a clear and naturally perceived sound contrast between the important elements in the scene and the background. This is to be achieved by letting the meaningful sounds (such as speech, music, and important environmental sounds) stand out from background sounds (such as babble or noise), while preserving access to all sound sources and all directions that carry distinct information. As all sound information remains accessible to the brain, users should be able to better focus on, understand and remember sounds of interest, as documented in clinical research (Santurette et al., 2020). This approach is fundamentally different from the conventional components of existing hearing aid algorithms, which can reduce the perception of sound scenes in different ways: directionality reduces access to sounds from the sides and the back of the user; noise reduction reduces access to all sounds that are not classified as speech; compression reduces access to important sound details that matter to the brain; and feedback management reduces access to optimal gain in dynamic situations.

The sound processing steps of the MSI comprise Spatial Clarity Processing followed by Neural Clarity Processing. Additionally, a sound enhancer provides dynamic sound details when noise suppression is activated, predominantly in difficult sound environments. Spatial Clarity Processing distinguishes between difficult and easy listening environments. In easy sound environments it simulates spatial hearing by mimicking the function of the outer ear, called the Virtual Outer Ear (VOE). For the VOE, three pinna shapes are available, which are selected according to the shape of the individual user's pinna to simulate the user's usual spatial hearing. In difficult listening environments, the processing of spatial hearing is taken over by a spatial balancer, supplied with an omnidirectional microphone signal and a beamformer based on the Minimum Variance Distortionless Response (MVDR) principle. The next MSI processing step, Neural Clarity Processing, balances the sound scene with a DNN that mimics how a human brain works. Trained on millions of real-life sound scenes, the DNN has learned which elements of a sound scene carry more information and which do not, and what a balanced relationship between these elements should comprise, going beyond "average user in


an average listening environment" standards. The result is a more detailed contrast between the meaningful sounds and the background (Santurette & Behrens, 2020, p. 4).
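The following toy sketch illustrates only the general rebalancing idea (attenuating, but not removing, background-dominated components); it is not Oticon's actual DNN. The foreground/background magnitude estimates, the soft-mask form and the 10 dB contrast target are assumptions made for illustration.

```python
# Toy sketch of the rebalancing idea (not Oticon's DNN): given magnitude
# spectra estimated for foreground (e.g. speech) and background, apply a soft
# gain that attenuates, but does not remove, the background.
import numpy as np

def rebalance(mix_spec, fg_spec, bg_spec, bg_atten_db=10.0):
    """Per-bin soft gain: keep foreground-dominated bins, soften the rest."""
    fg_ratio = fg_spec / (fg_spec + bg_spec + 1e-12)   # 0..1 "how foreground is this bin"
    bg_gain = 10 ** (-bg_atten_db / 20)                # e.g. -10 dB for pure background
    gain = bg_gain + (1.0 - bg_gain) * fg_ratio        # never fully removes the background
    return mix_spec * gain

# Tiny example with 4 frequency bins.
fg = np.array([0.9, 0.1, 0.8, 0.0])    # assumed foreground magnitudes
bg = np.array([0.1, 0.9, 0.2, 1.0])    # assumed background magnitudes
print(rebalance(fg + bg, fg, bg))
```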

3.1 DISCUSSION ON MORESOUND INTELLIGENCE (MSI)

The MSI technology represents a big step towards more integration of user perception in hearing aid development. The processing chain of MSI distinguishes between easy and difficult listening environments, for which two different processing steps are provided. With future developments in AI, such a binary distinction may no longer be necessary; rather, the different approaches could be interpolated. It would be interesting to observe and analyze how different user-specific listening situations and intentions affect perceptual needs. AI could be used in the analysis as well as in the application of appropriate adjustments. It is conceivable that future AI developments will allow even more customization, for example by scanning the individual pinna of the user and determining the DI.

4. AI FOR SOUND INDIVIDUALIZATION


Balling et al. (2021) present an AI approach to sound individualization driving the Sound Sense Learn (SSL) technology. This technology enables hearing aid wearers to adjust the sound in challenging everyday listening situations. For this, AI-optimized A-B comparisons are used: users iteratively choose between two sound settings and indicate how much one sample is preferred over the other. The stimuli are sampled from the best clusters for the activity that the user has indicated. A-B samples are presented and selected consecutively until the user has found the preferred balance. The integrated Bayesian optimization takes on the task of proposing the next sound setting in the iterative selection process, such that only a few selection steps are necessary. The individualization of the sound adjustment is driven forward particularly by an active learning approach, in which the machine learning model is continuously updated based on user input (see Figure 1). The Bayesian model used here processes the knowledge from previous A-B comparisons to select the next A-B samples. The Bayesian Gaussian process (GP) treats the user's sound preference as an internal preference function that is approximated by evaluating the data from the A-B comparisons. An Expected Improvement (EI) criterion measures the amount of improvement per A-B comparison that can potentially yield the maximal value of this function. After each response from the user, the GP model is updated, and the EI criterion is evaluated to find a new A-B comparison that will lead towards the optimal setting for the user.
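A minimal sketch of the core loop follows; it is our own simplification, not the Widex/WS Audiology implementation. A Gaussian process models the user's unknown preference over one gain parameter, and Expected Improvement proposes the next setting to present. A single noisy rating per comparison stands in for the paired-comparison preference model described by Balling et al.; the preference function, the parameter range and the noise level are assumptions.

```python
# Minimal sketch (not Widex's implementation): Bayesian optimization of a
# hearing-aid gain setting from iterative A-B comparisons. The user's hidden
# preference function, the 1-D search space and the rating noise are assumed.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)

def true_preference(gain_db):
    # Hypothetical internal preference function of one user (unknown in practice).
    return np.exp(-0.5 * ((gain_db - 4.0) / 3.0) ** 2)

def user_rates(gain_db):
    # Stand-in for the "degree of preference" feedback after an A-B choice.
    return true_preference(gain_db) + rng.normal(0, 0.05)

candidates = np.linspace(-10, 10, 201).reshape(-1, 1)   # e.g. treble gain in dB
X, y = [], []
for g in (-5.0, 5.0):                                    # first A-B pair presented to the user
    X.append([g]); y.append(user_rates(g))

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=0.05 ** 2, normalize_y=True)

for step in range(10):                                   # a handful of comparisons
    gp.fit(np.array(X), np.array(y))
    mu, sigma = gp.predict(candidates, return_std=True)
    best = max(y)
    # Expected Improvement: how much a candidate is expected to beat the best so far.
    z = (mu - best) / np.maximum(sigma, 1e-9)
    ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)
    x_next = candidates[np.argmax(ei)][0]
    X.append([x_next]); y.append(user_rates(x_next))     # next sample rated by the user

print(f"converged gain suggestion: {X[int(np.argmax(y))][0]:+.1f} dB")
```

In the real system, the optimization runs over several parameters simultaneously and is fed by pairwise choices rather than direct ratings, but the GP-plus-EI acquisition logic described in the text is the same.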


Figure 1: The actor (hearing aid user) in a daily-life sound scene obtains an optimized setting via selection of A-B sound sample comparisons. More than 2,000 settings are possible, but the Bayesian optimization reduces this to a maximum of 24 sound sample comparisons.

Via data exchange on a cloud server, the clinician can observe the personal programs that the user has created, what settings were established and how much each program has been used. If there are systematic trends, it is the clinician's responsibility to implement a more universal adjustment of the gain. Three laboratory studies yielded promising results. One laboratory study showed that 8 out of 10 participants achieved satisfactory sound adjustments after using SSL. The results of two additional laboratory studies (number of participants not given) showed that personalized program settings were significantly preferred to the baseline setting for basic audio quality. Listening comfort and speech clarity were also considered, but for these, participants did not show a significant preference in comparison to the baseline settings (Balling et al., 2021, p. 286).

In a daily-life survey, 118 experienced hearing aid users rated their satisfaction with different aspects of use. 53 of the participants indicated that they had used SSL (Balling et al., 2021, p. 286). It is assumed that not every participant felt the need to adjust their hearing aids, their hearing aids presumably being considered well fitted already. As part of the daily-life survey, the following data were collected on a cloud server and analyzed: the settings and amount of use of the created programs; the activities and intentions indicated by users when SSL was used; and the settings compared, together with the associated degree of preference. For the analysis of individual preferences in individual listening environments, 20,000 user-selected programs for gain settings of bass, middle and treble frequencies were stored. The settings were almost evenly distributed across all selectable settings, as can be seen in Figure 2.


Figure 2: This schematic graphic (based on Balling et al., 2021, p. 288) visualizes the distribution of individual preferences in individual listening environments across 20,000 user-selected programs.

As part of the data collection, users also indicated the activities during which they made a sound adjustment with SSL. 31,772 activities and the associated user-selected sound adjustment programs were analyzed. The activity "watching TV" was the overwhelmingly most frequent activity indicated by users, followed by "socializing", which competed with "speech". Besides the most common intentions of conversation and suppressing disturbances, another major type of intention was enjoying sound and music. According to Balling et al. (2021), TV is likely to be a


relatively easy situation to create SSL programs for, in contrast to a one-to-one conversation, where it is likely to be more difficult to systematically complete the process of A-B comparisons. Users could also rate the degree of preference for their chosen program after the A-B comparisons. 10,000 adjustment programs for daily life were chosen, and the acoustical and user inputs were analyzed using machine learning. The result of these analyses was a defined number of clusters for each activity. For example, while there was a single cluster for transport, i.e., generally turning down the volume, the programs for watching TV were distributed over more clusters, indicating diverse preferences (see Figure 3).
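As a hedged illustration of how such per-activity clusters could be derived, the sketch below applies k-means to synthetic three-band gain programs. The data, the feature layout and the cluster count are assumptions and do not reproduce the analysis of Balling et al. (2021).

```python
# Hedged sketch (not the Balling et al. pipeline): clustering user-created
# gain programs of one activity to find typical preference patterns.
# The feature layout (bass/mid/treble gain in dB) and cluster count are assumed.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)

# Toy stand-in for logged programs of one activity, e.g. "watching TV":
# columns = [bass gain, mid gain, treble gain] in dB.
programs_tv = np.vstack([
    rng.normal([ 4, 2,  6], 1.5, size=(60, 3)),   # e.g. a "clearer treble" group
    rng.normal([-3, 0, -2], 1.5, size=(40, 3)),   # e.g. a "softer overall" group
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(programs_tv)
for i, center in enumerate(kmeans.cluster_centers_):
    share = np.mean(kmeans.labels_ == i)
    print(f"cluster {i}: center = {np.round(center, 1)} dB, share = {share:.0%}")
```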


Figure 3: These schematic graphics (based on Balling et al., 2021, p. 291) give an impression of the significant characteristics of the cluster distribution, using the sound scenes transport and TV as examples. For transport only one cluster occurred, corresponding to a general turning down of the volume. For watching TV the programs were distributed over several clusters.

4.1 DISCUSSION ON SOUND SENSE LEARN (SSL)

The paper indicates that there is a wide range of individual preferences within individual listening environments, rather than generalized group preferences. There are obviously no default settings that can be generalized, which raises the question of whether basic settings should be fundamentally reconsidered as a static average function, like current prescriptive rules. AI could play an important role in such considerations and optimize the fine-tuning to individual environments using machine learning models capable of generating individual baseline settings automatically. The cluster distribution for the transport sound scene shows that transport sounds are perceived as noise. This is a case where a personalized program is clearly preferred over a default setting: the personalized, AI-assisted program could potentially replace the general default hearing aid setting.

Since SSL can only be used in situations with sufficient time, the data collected via the cloud may be biased in terms of the prevalence of activities in which users needed sound adjustments. That means it is conceivable that there were difficult sound scenes where the need for sound adjustments existed, but the adjustment was not carried out due to insufficient time. It is also mentioned that a satisfactory result was often achieved in fewer than the maximum of 24 A-B selection steps, because convergence had already been reached. These findings could also be situation-related, since the selection process requires a high level of concentration from the user. This makes clear how closely an efficient user-driven AI is intertwined with a user-friendly process for sound adjustment. In the SSL application discussed here, AI is used to optimize the fitting process via A-B iterations and allows users to spontaneously adjust the sound in everyday situations. From a scientific perspective, the results provide detailed insight into individual hearing preferences due to the ability to calibrate nuanced sound settings. However, the adjustment process


requires a lot of attention and can only be carried out in an iterative process, which means that everyday use is only possible to a limited extent. Studies have shown good test-retest reliability for user settings derived in a production task, i.e., setting sound and loudness directly (Gößwein et al., 2022; Nelson et al., 2018; Rennies et al., 2016). The authors state that in some cases users may not be able to understand the adjustment controls, and the need to change something they cannot control might lead to an unsuccessful trial-and-error process. The authors do not show evidence of whether users who have difficulties adjusting sound and loudness directly achieve better results by adjusting sounds via A-B comparison. It remains unclear whether the A-B comparison task is a viable process to create better fits in everyday life for a given user than setting sound and loudness directly. A possible solution to this dilemma between precision and simplicity could be a self-fitting setting, e.g., as described by Gößwein et al. (2022), followed by an A-B comparison task on demand, which could combine the best of both worlds: quick changes in conversations and comparison tasks for higher precision. In this way, individual sound adjustments could also be made possible for listening situations in which users need quick sound adjustments. We think that assumptions about user motivation based solely on the technical data collected may not be correct. For example, 53 of 118 participants did not use SSL, and it is assumed that their hearing aids were already well fitted. Another example is that Balling et al. (2021) assume user satisfaction if more than a third of users do not use SSL or if fewer than 24 A-B comparisons have been carried out. With the help of subjective user feedback, these assumptions could be validated.


5. CONCLUSIONS

There are many ways of integrating AI into hearing aids, providing more sound control and additional features to the user. Most applications focus on speech intelligibility, but other approaches emphasize a balanced audio scene. Self-adjustment allows user-driven sound adaptation in challenging hearing situations. Using AI, hearing aid users are given the opportunity to select preset programs for more speech intelligibility or to create individual user-specific programs for challenging hearing situations. All these applications are promising and show the potential of AI for hearing aids. Even though the AI applications address individual needs, the data analyses still focus on the average or typical hearing loss. A next step towards more individualization, for the benefit of user satisfaction, could be for AI to help make the sound adjustment process easier and quicker to use, with sound adjustments possible in finer gradations and over larger ranges, so that basic settings could also be affected. The individualization of sound adjustments for user-specific listening environments, driven by self-fitting in combination with AI, will be groundbreaking for future developments in self-determined hearing with hearing aids.

6. ACKNOWLEDGEMENTS

We thank Jie Liang Lin for her English language support.

7. CONFLICT OF INTERESTS

P. S. is CEO of sincEARe UG; T. B. is a full-time employee of Fraunhofer IDMT Oldenburg.

8. REFERENCES

Balling, L. W., Mølgaard, L. L., Townend, O., & Nielsen, J. B. B. (2021). The Collaboration between Hearing Aid Users and Artificial Intelligence to Optimize Sound. Seminars in Hearing, 42(3), 282–294.

Brændgaard, M., & Loong, B. M. K. (2020). An introduction to MoreSound Intelligence™. Oticon White Paper. https://wdh01.azureedge.net/-/media/oticon/main/pdf/master/whitepaper/69674uk_tech_paper_moresound_intelligence.pdf?la=en&rev=3F19&hash=6A1037B3951F262345E45C4A725D3CC7

Chalupper, J., Junius, D., & Powers, T. (2009). Algorithm lets users train aid to optimize compression, frequency shape, and gain. The Hearing Journal, 62(8). https://journals.lww.com/thehearingjournal/Fulltext/2009/08000/Algorithm_lets_users_train_aid_to_optimize.5.aspx

Convery, E., Keidser, G., Dillon, H., & Hartley, L. (2011). A self-fitting hearing aid: Need and concept. Trends in Amplification, 15(4), 157–166. https://doi.org/10.1177/1084713811427707

Cook, D. (2020). AI can now help you hear speech better. Hearing Loss Journal. https://www.hearinglossjournal.com/ai-can-now-help-you-hear-speech/

Dillon, H., Zakis, J. A., McDermott, H., Keidser, G., Dreschler, W., & Convery, E. (2006). The trainable hearing aid: What will it do for clients and clinicians? The Hearing Journal, 59(4). https://journals.lww.com/thehearingjournal/Fulltext/2006/04000/The_trainable_hearing_aid__What_will_it_do_for.5.aspx

Dreschler, W. A., Keidser, G., Convery, E., & Dillon, H. (2008). Client-Based Adjustments of Hearing Aid Gain: The Effect of Different Control Configurations. Ear and Hearing, 29(2). https://journals.lww.com/ear-hearing/Fulltext/2008/04000/Client_Based_Adjustments_of_Hearing_Aid_Gain__The.7.aspx

Fabry, D. A., & Bhowmik, A. K. (2021). Improving Speech Understanding and Monitoring Health with Hearing Aids Using Artificial Intelligence and Embedded Sensors. Seminars in Hearing, 42(3), 295–308. https://doi.org/10.1055/s-0041-1735136

Gößwein, J. A., Rennies, J., Huber, R., Bruns, T., Hildebrandt, A., & Kollmeier, B. (2022). Evaluation of a semi-supervised self-adjustment fine-tuning procedure for hearing aids. International Journal of Audiology, 1–13. https://doi.org/10.1080/14992027.2022.2028022

Johansen, B., Petersen, M. K., Korzepa, M. J., Larsen, J., Pontoppidan, N. H., & Larsen, J. E. (2018). Personalizing the Fitting of Hearing Aids by Learning Contextual Preferences From Internet of Things Data. Computers, 7(1). https://doi.org/10.3390/computers7010001

Keidser, G., Dillon, H., Flax, M., Ching, T., & Brewer, S. (2011). The NAL-NL2 Prescription Procedure. Audiology Research, 1(1), e24. https://doi.org/10.4081/audiores.2011.e24

Nelson, P. B., Perry, T. T., Gregan, M., & VanTasell, D. (2018). Self-Adjusted Amplification Parameters Produce Large Between-Subject Variability and Preserve Speech Intelligibility. Trends in Hearing, 22, 2331216518798264. https://doi.org/10.1177/2331216518798264

Rennies, J., Oetting, D., Baumgartner, H., & Appell, J. E. (2016). User-interface concepts for sound personalization in headphones. Journal of the Audio Engineering Society. http://publica.fraunhofer.de/dokumente/N-491411.html

Santurette, S., & Behrens, T. (2020). The audiology of Oticon More™. Oticon. https://wdh01.azureedge.net/-/media/oticon/main/pdf/master/whitepaper/69619uk_wp_oticon_more_audiology.pdf

Yellamsetty, A., Ozmeral, E. J., Budinsky, R. A., & Eddins, D. A. (2021). A Comparison of Environment Classification Among Premium Hearing Instruments. Trends in Hearing, 25, 2331216520980968. https://doi.org/10.1177/2331216520980968

Zakis, J. A., Dillon, H., & McDermott, H. J. (2007). The design and evaluation of a hearing aid with trainable amplification parameters. Ear and Hearing, 28(6), 812–830. https://doi.org/10.1097/AUD.0b013e3181576738

