Volume: 44, Part: 2

Effects of binaural classroom noise scenarios on primary school children's speech perception and listening comprehension

Larissa Leist [1]
Technische Universität Kaiserslautern, Cognitive and Developmental Psychology, 67663 Kaiserslautern, Germany

Carolin Reimers
RWTH Aachen University, Institute for Hearing Technology and Acoustics, 52062 Aachen, Germany

Stephan Fremerey
Technische Universität Ilmenau, Audiovisual Technology Group, 98693 Ilmenau, Germany

Janina Fels
RWTH Aachen University, Institute for Hearing Technology and Acoustics, 52062 Aachen, Germany

Alexander Raake
Technische Universität Ilmenau, Audiovisual Technology Group, 98693 Ilmenau, Germany

Thomas Lachmann
Technische Universität Kaiserslautern, Cognitive and Developmental Psychology, 67663 Kaiserslautern, Germany
Centro de Ciencia Cognitiva, Facultad de Lenguas y Educación, Universidad Nebrija, Madrid, Spain

Maria Klatte [2]
Technische Universität Kaiserslautern, Cognitive and Developmental Psychology, 67663 Kaiserslautern, Germany

[1] lleist@rhrk.uni-kl.de
[2] klatte@rhrk.uni-kl.de

ABSTRACT

Instruction at school relies heavily on oral discourse. Listening comprehension is thus of major importance for successful learning. However, in many classrooms, children's listening is impaired by unfavourable acoustic conditions such as indoor noise and reverberation. Most studies on the effects of environmental noise on children's speech perception used simple monaural noise recordings and basic tasks such as identification of isolated words or syllables. In the current study, we aimed at a more realistic simulation of both the auditory classroom environments and the listening requirements faced by children at school. We analysed the effects of a binaural and a monaural version of a classroom noise scenario on speech perception (word-to-picture matching) and listening comprehension in second-graders (N=37). Differential effects of the sounds were found.
In the monaural condition, speech perception was much more strongly affected than listening comprehension, and speech perception performance was unrelated to listening comprehension. In contrast, in the binaural condition, both tasks were affected to roughly the same degree (18%), and speech perception performance significantly predicted listening comprehension. The use of realistic binaural auditory scenes provides a promising strategy to increase the external validity of studies on the effects of environmental noise on children's learning.

1. INTRODUCTION

Learning at school relies heavily on oral instruction. Effective listening is thus a key prerequisite for school achievement. However, listening comprehension presupposes adequate acoustic conditions, which are not always present in classrooms. Field studies confirm that noise and reverberation in classrooms have a detrimental impact on children's learning and well-being at school [1-3]. Developmental psychoacoustic studies have revealed that the ability to understand speech in adverse listening conditions improves continuously across childhood and does not reach adult levels until early adolescence [4,5]. Therefore, students in the early grades are especially affected by noise and reverberation. Experimental studies on the effects of environmental noise on children's ability to understand speech have focused on simple speech perception tasks requiring identification of isolated speech targets in noise and/or reverberation. However, the listening requirements faced by children during school lessons go far beyond pure identification. Effective listening in these situations requires storage and processing of complex oral information in working memory while constructing a coherent mental model of the story meaning [6]. There is evidence that noise may affect the storage and processing of heard speech even when the signal-to-noise ratio (SNR) is high enough to allow perfect or near-perfect identification of the speech targets [7-9].
Thus, effects of noise and reverberation on word identification tasks do not allow predictions of decrements in complex listening tasks. In addition, the noise maskers used in psychoacoustic studies on speech-in-noise perception do not reflect the sound environment of children in classrooms. Aiming to explore the impact of noise and reverberation on children's speech perception in a more realistic, classroom-like setting, Klatte and colleagues [10] found differential effects of single-talker speech and non-speech classroom noise on word identification (word-to-picture matching) and listening comprehension (acting out complex oral instructions). SNRs varied between -3 and 3 dB. In the comprehension task, background speech and classroom noise significantly reduced children's performance, with first-graders suffering the most, while adults were unaffected. Background speech was more disruptive than classroom noise. In contrast, word identification was much more impaired by classroom noise than by speech. The authors argued that, at the SNRs used in their study, classroom noise and background speech affected performance through different mechanisms. Classroom noise masked the speech signal. This is especially harmful when identification of isolated words is required, as there are no contextual cues available that might be used for reconstructing the degraded input. Background speech was a less potent masker, but it interfered with the short-term memory processes that children (but not adults) rely on when listening to complex sentences.

Research Question

In Klatte and colleagues [10], mono recordings of the sounds were used, which were presented via loudspeakers located at the sides of the laboratory room. Obviously, the resulting aural impression differs significantly from that evoked in a real classroom environment, where sounds are spatially spread across the room and sound sources change continuously.
In the current study, we aimed to further increase the realism of the design of Klatte and colleagues [10] by including a binaural classroom noise scenario. Here, we compared the effects of monaural and binaural noise scenarios. Our aim was to find out whether, and to what extent, the more realistic binaural presentation and the monaural presentation yield different effects on children's speech perception and listening comprehension.

2. METHODS

2.1. Participants

A total of 37 second-grade children aged between 6;3 and 8;2 (9 females, M = 7;5, SD = 0;3) took part in the study. The children were recruited via a primary school in Kaiserslautern. All children were native German speakers and had normal (or corrected-to-normal) vision and normal hearing (self-reports and parental reports).

2.2. Apparatus

The word-to-picture matching task was developed in Python 3.7/PsychoPy 3.1.5 [14] and run on a 15.6-inch laptop computer (HP ProBook 450) under Microsoft Windows 10. The display had a resolution of 1920 × 1080 pixels and a refresh rate of 60 Hz. The sounds were delivered via headphones (Sennheiser HD650) and an audio interface (Focusrite Scarlett 2i2, 2nd Generation). Images of an elementary school classroom were placed around each workstation to create a more authentic environment.

2.3. Tasks

We used modified versions of the tasks from Klatte and colleagues [10]. Each task was constructed in three parallel versions. Speech perception was assessed by means of a word-to-picture matching task requiring discrimination between phonologically similar words. A total of 84 lists of four phonologically similar German nouns were created (e.g., Kopf (head), Topf (pot), Knopf (button), Zopf (braid)). Each word was represented by a simple, easy-to-name colored drawing. Each trial started with a visual cue presented for 1.5 seconds, followed by a spoken word.
Then, a screen with four pictures was shown, one representing the target word and three representing similar-sounding distractors. The child's task was to mouse-click on the picture that corresponded to the target word. In each sound condition, 28 trials were performed.

Listening comprehension was assessed via a paper-and-pencil test requiring the execution of complex oral instructions. In each of the sound conditions, participants heard 8 oral instructions spoken in a female voice, for example, "Male ein Kreuz unter das Buch, das neben einem Stuhl liegt" ("Draw a cross under the book that lies next to the chair"). The task was to carry out the instructions on prepared response sheets. Each instruction was represented on the response sheet by a row of small black-and-white drawings showing the target objects (e.g., a book lying next to a chair) and distractor stimuli (e.g., a book lying next to a ball). The response sheet was also visible on the computer screen, with a red arrow indicating the row representing the current instruction. Each trial started with an auditory cue, followed by the oral instruction. After the offset of an instruction, the participants had 18 seconds to complete the entries on the response sheet. Scoring was based on the number of elements correctly executed according to the respective instruction.

2.4. Sounds

Speech signals: The words and instructions were read by a professional female speaker in a sound-proof booth. Mono recordings were produced with a sampling rate of 44,100 Hz and 16-bit resolution.

Auditory classroom scene: The auditory scene represented a classroom-like auditory environment with everyday classroom activities, e.g., furniture use, desk noise including writing and the use of other stationery items, footsteps, doors opening and closing, zippers on bags, undoing a plastic wrapper, and browsing a book. The anechoic background sound was presented in a monaurally and a binaurally synthesized condition.
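Presenting the speech targets against such a background scene requires mixing the two signals at a controlled signal-to-noise ratio (the study used an SNR of -3 dB). As a minimal illustration, not the study's actual pipeline, RMS-based SNR mixing can be sketched in numpy (`mix_at_snr` is a hypothetical helper; the sine signals are stand-ins for speech and noise):

```python
import numpy as np

def rms(x):
    """Root-mean-square level of a signal."""
    return np.sqrt(np.mean(np.square(x)))

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise ratio equals `snr_db` (in dB),
    then return the mixture. Assumes equal-length signals."""
    gain = rms(speech) / (rms(noise) * 10 ** (snr_db / 20))
    return speech + gain * noise

# Synthetic check with two sine tones and a target SNR of -3 dB
fs = 44100
t = np.arange(fs) / fs
speech = 0.1 * np.sin(2 * np.pi * 440 * t)
noise = 0.5 * np.sin(2 * np.pi * 1000 * t)
mixture = mix_at_snr(speech, noise, -3.0)
scaled_noise = mixture - speech          # recover the scaled noise component
snr = 20 * np.log10(rms(speech) / rms(scaled_noise))
print(round(snr, 3))                     # close to -3.0
```

The same gain computation applies regardless of whether the noise component is a mono file or a binaurally rendered scene, as long as the level convention (here, broadband RMS) is held fixed across conditions.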
The background sound was created by placing realistic sound sources in a 3D classroom modeled in SketchUp; the respective sounds were rendered using RAVEN, a room acoustic simulation tool developed at ITA [11]. Sixteen sound source locations were evenly distributed across the room. To prevent learning effects, the different noises were distributed irregularly (in space and time), as in real classroom scenarios. Some noises (like writing) were played more often than others (like doors) to match their frequency in reality. Four children talking in Hindi, a language that none of the participants were able to understand or speak, were added to the scene. In both the monaural and the binaural condition, two talkers were active at any time, their order changing randomly. In the monaural condition, all sounds were presented spatially undifferentiated and seemed to originate from straight ahead. The binaural condition was created using a generic HRTF from the FABIAN dummy head [12]. In the binaural condition, the sounds were spatially spread across the room, and the four spatial locations of the talkers changed randomly. HRTFs are known to differ considerably between adults and children [15], and the type of binaural reproduction can lead to significant differences even in cognitive tasks such as switching auditory attention [16]. However, since this experiment was a first attempt at spatially separating the talkers, we opted for a generic HRTF. The overall presentation level of both the monaural and the binaural sounds was L_Aeq,1m = 60 dB; the SNR was -3 dB. In the silent control condition, an air-conditioning noise of L_Aeq,1m = 41.5 dB was audible.

2.5. Design and Procedure

Each child performed both tasks in each of the three sound conditions (silent control, monaural auditory scene, and binaurally synthesized auditory scene). Sound conditions were varied between blocks.
The order of sound conditions, and the allocation of test versions to sound conditions, were counterbalanced between participants. Testing was performed in groups of three to four at the TU Kaiserslautern in a sound-attenuated booth. The booth was equipped with four computer workstations, with a distance of about 4 meters between them. Each session started with a general introduction provided by the experimenter, followed by the presentation of 4-second excerpts of the binaural and monaural classroom sounds. Then, all pictures used in the word identification task were introduced, accompanied by the respective word presented via headphones. Subsequently, the children performed the word identification task. Thereafter, the listening comprehension task was introduced and performed. Both tasks started with four practice trials. In the sound conditions, the sound was played throughout the block of trials. The session took about 40 minutes in total. The study was approved by the Rhineland-Palatinate school authority and by the ethics committee of the TU Kaiserslautern. Informed written consent was provided by the parents of the children.

3. RESULTS

For the analyses, raw scores of both tasks were transformed into proportion-correct scores. Mean proportion-correct scores with respect to task and sound condition are depicted in Figure 1. These scores were analyzed using a 2 × 3 repeated-measures factorial design. The ANOVA was performed on the within-subject factors task (listening comprehension vs. speech perception) and sound condition (silence, binaural classroom scenario, monaural classroom scenario). Mauchly's test revealed that the assumption of sphericity was violated for the interaction between task and sound condition, χ²(2) = 8.54, p = .014. Therefore, degrees of freedom were corrected using Huynh-Feldt estimates of sphericity (ε = .86) [13].
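The full 2 × 3 analysis with sphericity correction would typically be run in standard statistics software; as a sketch of the underlying computation only (toy data, hypothetical function name, not the study's scores), the F-ratio of a one-way repeated-measures ANOVA, the form used in the per-task follow-up analyses below, can be written out in numpy:

```python
import numpy as np

def rm_anova_oneway(X):
    """F-test of a one-way repeated-measures ANOVA.
    X: (n_subjects, k_conditions) array of scores."""
    n, k = X.shape
    grand = X.mean()
    # Partition the total sum of squares into condition, subject, and error parts
    ss_cond = n * np.sum((X.mean(axis=0) - grand) ** 2)
    ss_subj = k * np.sum((X.mean(axis=1) - grand) ** 2)
    ss_total = np.sum((X - grand) ** 2)
    ss_err = ss_total - ss_cond - ss_subj
    df_cond, df_err = k - 1, (n - 1) * (k - 1)
    F = (ss_cond / df_cond) / (ss_err / df_err)
    return F, df_cond, df_err

# Toy data: 3 subjects x 2 sound conditions (illustrative values only)
X = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 3.0]])
F, df1, df2 = rm_anova_oneway(X)
print(F, df1, df2)  # F = 3.0, df = (1, 2)
```

A sphericity correction such as Huynh-Feldt then rescales both degrees of freedom by the estimated epsilon (here ε = .86) before the p-value is looked up, which is why the interaction below is reported with fractional degrees of freedom.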
The ANOVA revealed significant main effects of task, F(1,36) = 26.6, p < .001, partial η² = .43, and sound condition, F(2,72) = 213, p < .001, partial η² = .86. Furthermore, there was a significant task × sound interaction, F(1.71,61.6) = 47.2, p < .001, partial η² = .57, reflecting that the sound effects differed between tasks. In order to further explore this interaction, separate analyses were performed for both tasks. For speech perception, the analysis confirmed a significant effect of sound condition, F(2,72) = 287, p < .001, partial η² = .89. Bonferroni-corrected post-hoc tests revealed significant differences between the sound conditions (all p < .001). Performance in the silent control condition was nearly perfect (M = .97, SD = .048), and significantly better when compared to both noise conditions. Performance in the binaural condition (M = .74, SD = .11) was better when compared to the monaural condition (M = .50, SD = .13). For listening comprehension, the analyses also confirmed a significant main effect of sound, F(2,72) = 34.4, p < .001, partial η² = .73. Performance in the silent control condition was near-perfect (M = .94, SD = .005), and significantly better when compared to the noise conditions (p < .001), which did not differ (M = .76, SD = .14 in both conditions). Further analyses revealed that speech perception performance in the binaural condition significantly predicted listening comprehension in the binaural condition (r = .375, p < .05), whereas in the monaural conditions, speech perception and listening comprehension were unrelated (p = .19).

Figure 1: Performance of 2nd-grade children in the listening comprehension task (left panel) and speech perception task (right panel) with respect to the sound condition (silent control, binaural classroom noise scenario, monaural classroom noise scenario). Error bars denote bootstrapped confidence intervals.
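The significance of a correlation such as the reported r = .375 with N = 37 follows from the standard t-transformation of r. A minimal pure-Python sketch (hypothetical helper names, toy data rather than the study's scores):

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t-value for testing r against zero, with n paired observations."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Toy data: perfect negative association
print(pearson_r([1, 2, 3], [6, 4, 2]))        # -1.0
# r = .375 with n = 37 gives t on 35 degrees of freedom
print(round(t_statistic(0.375, 37), 2))       # 2.39
```

With df = 35, t ≈ 2.39 exceeds the two-tailed .05 critical value (≈ 2.03), consistent with the reported p < .05 for the binaural condition.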
4. DISCUSSION AND CONCLUSION

The current study confirmed that the effects of classroom noise on children's speech perception depend on the method of sound presentation. With binaural presentation, children's ability to identify isolated spoken words was significantly less affected than with monaural presentation of the same sound. This finding indicates that, with binaural presentation, children are able to use spatial cues to separate the signal from the background noise. We may conclude that, in studies using simple monaural sound presentation, the effect of environmental noise on children's speech perception in real-life situations might be over-estimated.

In the current study, a task requiring comprehension of complex oral instructions was included in order to simulate listening requirements that children face during school lessons. Performance in this task dropped by 18% in both sound conditions. The fact that, in the monaural condition, listening comprehension was much less affected than speech perception (i.e., word identification) replicates the finding of Klatte et al. [10], and may be explained by children's ability to reconstruct elements that are masked by the background noise through the use of contextual cues. However, speech perception performance in the monaural condition did not predict listening comprehension in noise. This casts further doubt on the validity of studying the effects of simple monaural noise recordings on word identification in order to estimate noise effects in everyday listening situations.

In contrast, in the binaural condition, speech perception and listening comprehension were impaired to roughly the same degree, and speech perception significantly predicted listening comprehension. We may thus conclude that, with binaural presentation of the noise, effects on word identification provide a more valid estimate of effects on complex listening tasks than simple monaural presentation does.
Furthermore, the significant correlation between word identification and listening comprehension in the binaural condition indicates that the same mechanisms are at play in both tasks. In view of the spatial distribution and change of the sound sources, the binaural scene may divert children's attention away from the focal task. However, for the listening task, due to the spatially separable speech streams, impairments of short-term memory processes may also play a role.

To summarize, the current study confirmed that using realistic binaural auditory scenes is a promising strategy to increase the external validity of studies on the effects of environmental noise on children's learning.

5. ACKNOWLEDGEMENTS

This research was funded by the German Research Foundation (DFG, project ID 444697733) under the title "Evaluating cognitive performance in classroom scenarios using audiovisual virtual reality – ECoClass-VR". We thank all children, teachers and parents for their cooperation in the current study. We also want to thank Manuj Yadav for creating the classroom scenarios.

6. REFERENCES

1. Astolfi, A., Puglisi, G. E., Murgia, S., Minelli, G., Pellerey, F., Prato, A., & Sacco, T. Influence of classroom acoustics on noise disturbance and well-being for first graders. Frontiers in Psychology, 2736 (2019).
2. Klatte, M., Hellbrück, J., Seidel, J., & Leistner, P. Effects of classroom acoustics on performance and well-being in elementary school children: A field study. Environment and Behavior, 42, 659–692 (2010).
3. Mogas Recalde, J., Palau, R., & Márquez, M. How classroom acoustics influence students and teachers: A systematic literature review. JOTSE: Journal of Technology and Science Education, 11(2), 245–259 (2021).
4. Talarico, M., Abdilla, G., Aliferis, M., Balazic, I., Giaprakis, I., Stefanakis, T., ... & Paolini, A. G. Effect of age and cognition on childhood speech in noise perception abilities. Audiology and Neurotology, 12(1), 13–19 (2007).
5. Klatte, M., Bergström, K., & Lachmann, T. Does noise affect learning? A short review on noise effects on cognitive performance in children. Frontiers in Psychology, 4, 578 (2013).
6. Kintsch, W. The role of knowledge in discourse comprehension: A construction-integration model. Psychological Review, 95(2), 163 (1988).
7. Kjellberg, A., Ljung, R., & Hallman, D. Recall of words heard in noise. Applied Cognitive Psychology, 22(8), 1088–1098 (2008).
8. Hurtig, A., Keus van de Poll, M., Pekkola, E. P., Hygge, S., Ljung, R., & Sörqvist, P. Children's recall of words spoken in their first and second language: Effects of signal-to-noise ratio and reverberation time. Frontiers in Psychology, 6, 2029 (2015).
9. Ljung, R., Sörqvist, P., Kjellberg, A., & Green, A.-M. Poor listening conditions impair memory for intelligible lectures: Implications for acoustic classroom standards. Building Acoustics, 16(3), 257–265 (2009).
10. Klatte, M., Lachmann, T., & Meis, M. Effects of noise and reverberation on speech perception and listening comprehension of children and adults in a classroom-like setting. Noise & Health, 12(49), 270–282 (2010).
11. Schröder, D., & Vorländer, M. RAVEN: A real-time framework for the auralization of interactive virtual environments. Forum Acusticum, Aalborg, Denmark (2011).
12. Brinkmann, F. The FABIAN head-related transfer function data base (2017).
13. Girden, E. ANOVA. SAGE Publications, Inc. (1992).
14. Peirce, J. W. PsychoPy—psychophysics software in Python. Journal of Neuroscience Methods, 162(1-2), 8–13 (2007).
15. Fels, J., & Vorländer, M. Anthropometric parameters influencing head-related transfer functions. Acta Acustica united with Acustica, 95(2), 331–342 (2009).
16. Oberem, J., Lawo, V., Koch, I., & Fels, J. Intentional switching in auditory selective attention: Exploring different binaural reproduction methods in an anechoic chamber. Acta Acustica united with Acustica, 100(6), 1139–1148 (2014).