Serial recall performance under different room acoustic conditions

Jan Selzer, Florian Schelle ¹
Institute for Occupational Safety and Health of the German Social Accident Insurance
Alte Heerstr. 111, 53757 Sankt Augustin, GERMANY

André Fiebig ²
Technische Universität Berlin
Einsteinufer 25, 10587 Berlin, GERMANY

ABSTRACT

In office workplaces, high demands are placed on the concentration of employees. This contrasts with the effect of irrelevant speech, which has been shown to decrease working memory performance. The presented study investigates the impact of different room acoustic conditions on the irrelevant speech effect. To address this question, the decrease in performance was investigated with a serial recall test in a within-subject design. Six stimuli were used in the experiment: silence as reference, pink noise, and two speech signals with different degrees of fluctuation strength, each presented with long reverberation and without reverberation. Each stimulus was used in twelve trials. One trial consisted of the representation phase, in which nine digits were presented sequentially in random order, and the recall phase. Furthermore, two test designs were used. In the first design, the stimulus was played back continuously throughout twelve trials. In the second design, the stimulus was randomized for each trial. In addition, annoyance was assessed and a closing interview was conducted. The results of the study with 44 participants are presented and discussed in this contribution. Overall, a significant decrease in performance is observed in the speech conditions compared to the reference condition.

1. INTRODUCTION

59 % of German employees work in offices [1]. The activities carried out at those workplaces place high demands on concentration. Furthermore, in open-plan and multi-person offices there is an increased disruptive effect due to irrelevant speech and sound not connected to one's own work assignment.
Such sounds verifiably reduce working memory performance and increase the error rate, which is summarized by the term irrelevant sound effect [2, 3]. Whereas the impact of irrelevant speech on cognitive performance has already been investigated and modeled in numerous studies (e.g. [4, 5, 6]), 'the effect of room acoustics on human performance is unclear when speech is seen as a distractor' [7].

¹ jan.selzer@dguv.de, florian.schelle@dguv.de
² andre.fiebig@tu-berlin.de

inter.noise, 21-24 August, Scottish Event Campus, Glasgow

This contribution presents an experiment addressing the question of to what extent working memory performance is impaired by irrelevant sound under different room acoustic conditions. To this end, the decrease in performance (DP) is captured by a serial recall test. Two contrasting acoustic conditions are chosen: a 'dry' speech signal recorded in a voice booth and a reverberant, auralized condition in an office without any acoustic treatment. Accordingly, the working hypothesis H1 is: The cognitive performance, determined by the decrease in performance in a serial recall test, and the annoyance are affected by a prolonged reverberation time.

2. METHODS

A listening test is necessary to address the research question. For the implementation of the listening test, a web-based platform was developed.³ The experiment uses a within-subject design. The participants use a computer mouse, a 24-inch display and AKG K425 headphones in an ordinary office room of the Institute for Occupational Safety and Health (IFA), equipped with a table (140 cm x 80 cm) and a 180 cm x 200 cm sound screen separating them from the experiment supervisor. The invitation to participate was circulated within the IFA at the beginning of 2021. Because of the pandemic situation, it was only possible to perform tests with participants working at the institute.
A condition of participation is a self-report of unrestricted or only slightly reduced hearing ability. On the day of participation in the listening test, a written explanation of the test procedure is handed out, including the declaration of consent. Further instructions are given by the web application, e.g. when to put the headphones on, to ignore the sounds played through the headphones and to solve the task without repeating the digits aloud.

2.1. Serial Recall Test

Working memory is assessed by a serial recall test as described by Schlittmeier [10]. Nine digits are presented to the subject visually on the display, sequentially and in random order, at a rate of 1 digit/s (see Figure 1). The beginning and end of the representation are marked by a bullet point (•). After the representation phase, the subject needs to recall the sequence. The input is made via a 3x3 numeric field with randomized number positions, so that subjects cannot memorize an input pattern instead of the numbers (see Figure 2). Representation and recall together form one trial. For each stimulus presented, twelve trials are performed. Each subject starts with two training sequences and is invited to ask questions at any time.

However, the design of [10] is modified for this study. The experiment is conducted in two designs: acute and cumulative. The cumulative design is similar to [10] and plays one stimulus throughout twelve trials (during both the representation and the recall phase). In contrast, the acute design plays a randomly selected stimulus in every trial, and only during the representation phase. In the cumulative design, a pause is offered by the software, while in the acute design participants are instructed to pause whenever they need to, after a recall phase. Furthermore, every participant has the opportunity to cancel the experiment at any time without stating reasons and without fear of consequences.

2.2. Stimuli

Six stimuli are used in this experiment. The 'base' stimuli are silence as the reference condition and pink noise.
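The trial structure of Section 2.1 and the two playback designs can be sketched in a few lines of Python. This is only an illustrative sketch, not the JavaScript application used in the study. The names `make_trial`, `make_schedule`, `STIMULI` and `TRIALS_PER_STIMULUS` are hypothetical, and the sketch assumes that the nine digits are the digits 1 to 9 without repetition, which the text suggests but does not state explicitly.

```python
import random

# Stimulus labels follow the abbreviations used in the figures of this paper.
STIMULI = ["Silence", "PN", "SwB", "SwB+", "FS", "FS+"]
TRIALS_PER_STIMULUS = 12
DIGITS = list(range(1, 10))  # ASSUMPTION: digits 1..9, each used exactly once


def make_trial(rng):
    """Return (sequence, keypad) for one trial.

    sequence: the nine digits in random presentation order.
    keypad:   a 3x3 numeric field with randomized digit positions.
    """
    sequence = rng.sample(DIGITS, k=len(DIGITS))
    flat = rng.sample(DIGITS, k=len(DIGITS))
    keypad = [flat[row * 3:(row + 1) * 3] for row in range(3)]
    return sequence, keypad


def make_schedule(design, rng):
    """Return the stimulus played in each of the 72 trials.

    cumulative: stimuli in random order, each kept for twelve consecutive trials.
    acute:      a randomly chosen stimulus per trial, each used twelve times overall.
    """
    if design == "cumulative":
        order = rng.sample(STIMULI, k=len(STIMULI))
        return [stimulus for stimulus in order for _ in range(TRIALS_PER_STIMULUS)]
    if design == "acute":
        schedule = STIMULI * TRIALS_PER_STIMULUS
        rng.shuffle(schedule)
        return schedule
    raise ValueError(f"unknown design: {design!r}")


if __name__ == "__main__":
    rng = random.Random(2021)  # fixed seed only to make the demo repeatable
    sequence, keypad = make_trial(rng)
    print("sequence:", sequence)
    print("keypad:", keypad)
    print("first acute trials:", make_schedule("acute", rng)[:6])
```

Drawing the keypad layout independently of the digit sequence mirrors the stated purpose of the randomized numeric field: the spatial input pattern carries no information about the sequence to be recalled.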
Based on one speech recording, two contrasting speech signals with different degrees of fluctuation strength are created, each presented with long reverberation and without reverberation.

³ A JavaScript-based application was developed by the author to perform the serial recall test and collect the results of the questionnaires in one place. The results were transferred through Node.js to a MongoDB database. Safari 14 in full-screen mode on macOS 10.14.6 was used for the listening tests.

Figure 1: Flow chart of the listening test with two test designs: acute and cumulative; annoyance is captured using ISO/TS 15666 [8], LEF-K is the short-form noise-sensitivity questionnaire [9]. The numeric field input is presented in Figure 2.

The speech recording presents the story of the Frog King read by a professional male speaker [11]. The recording is provided by the stimulus database of the German Acoustical Society⁴. Pauses were inserted into or removed from the time signal of the speech to reach different fluctuation strength values (Table 1). In order to investigate different room acoustic conditions, an acoustically unfavorable two-person office was used for auralization [12, p. 36]. The software CadnaR by DataKustik GmbH was used for auralization.
The original situation in the sound booth during the recording is reported with a single-value reverberation time of 0.2 s [11]. The reverberation time in the auralized situation is about 1.1 s [12]. Additionally, an STI of 0.59 is given. No loudness adjustment is made (see Table 1). During the experiment, the stimuli are played through AKG K425 headphones connected to a USB audio interface RME Babyface Pro. Acoustic parameters of the stimuli can be found in Table 1. For this, the sounds were recorded by means of an artificial head HEAD acoustics HSU III.2. The values are calculated with HEAD acoustics ArtemiS SUITE 12.

⁴ DEGA e.V., Stimulus-Datenbank: https://www.dega-akustik.de/va/stimulus-datenbank

Figure 2: Randomized numeric field with the first two digits filled in as an example.

Table 1: Parameterisation of the stimuli via fluctuation strength F, loudness N5 (DIN 45631/A1) and sound pressure level LAeq; recorded with an artificial head during playback from the application; the upper part of the table corresponds to the stimuli of the acute design, the lower part to the cumulative design.

| Stimulus (abbreviation) | F [vacil] | N5 [sone] | LAeq [dB] |
|---|---|---|---|
| Acute design | | | |
| Pink noise (PN) | 0.014 | 7.3 | 48.9 |
| Speech with breaks (SwB) | 0.248 | 10.7 | 55.6 |
| SwB + Reverberation | 0.142 | 16.7 | 62.5 |
| Fast speech (FS) | 0.191 | 10.5 | 56.5 |
| FS + Reverberation | 0.075 | 16.9 | 63.4 |
| Cumulative design | | | |
| Speech with breaks (SwB) | 0.255 | 11.2 | 55.5 |
| SwB + Reverberation | 0.118 | 17.8 | 62.2 |
| Fast speech (FS) | 0.240 | 11.4 | 56.4 |
| FS + Reverberation | 0.072 | 18.1 | 63.1 |

2.3. Questionnaires

To assess annoyance, the ISO/TS 15666 questionnaire is used [8]. In the cumulative test design, annoyance is queried after the twelve trials of one stimulus; in the acute test design, it is queried after the first, fourth, eighth and twelfth trial of one stimulus (see Figure 1). The five-point verbal German scale was used with the (translated) question: How much does the heard sound bother, disturb or annoy you? Not at all (1), Slightly (2), Moderately (3), Very (4), Extremely (5).
After completion of all 72 trials, the German short-form noise-sensitivity questionnaire (LEF-K [9]) has to be completed. Nine items are rated on a verbal four-point scale, which is converted into values from zero to three. The sum over the items (maximum 9 · 3 = 27) is evaluated as the noise sensitivity value. The median of the data is used to separate the subjects into two groups: noise-sensitive and non-noise-sensitive subjects. After finishing the experiment on the screen and taking off the headphones, a closing interview is conducted with the questions: Do you have any comments on the experiment? Did you notice anything in particular?

3. RESULTS

The experiment took place from March to June 2021 in an office of the IFA. 44 persons participated voluntarily (20 female). The listening test lasted from 35 min to 1 h. Before starting the test, the subjects were evenly distributed between the two test designs. Every person stated having unrestricted or only slightly reduced hearing ability. The mean age is 40 years, with a range from 24 to 63 years (SD: 11.6 years).

3.1. Serial Recall Test

For the calculation of the DP, every digit recalled at a wrong position is counted as an error. The number of errors in all trials of one stimulus divided by the number of possible correct answers gives the mean error rate. By relating the mean error rate to the silence condition as an intra-individual baseline, a DP is calculated for the remaining sound conditions.

Figure 3: Representation of the entire data for decrease in performance per stimulus (n = 42) with identification of significant differences, paired t-test with Benjamini/Hochberg adjustment; adjusted p-value: *** p < 0.001, n.s. not significant (for abbreviations see Table 1).

Two subjects, one from each test design, are excluded from the analysis because they showed only negative DP values.
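The scoring described in this section can be illustrated with a minimal Python sketch. It is not the analysis code of the study. In particular, the text does not state how the error rate is related to the silence baseline; the sketch assumes a simple difference (error rate under sound minus error rate in silence), and the function names and example data are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# One trial: (presented digits, recalled digits)
Trial = Tuple[Sequence[int], Sequence[int]]


def error_rate(trials: List[Trial]) -> float:
    """Share of digits recalled at a wrong serial position across all trials."""
    errors = 0
    possible = 0
    for presented, recalled in trials:
        possible += len(presented)
        for position, digit in enumerate(presented):
            # A missing or different digit at this position counts as one error.
            if position >= len(recalled) or recalled[position] != digit:
                errors += 1
    return errors / possible


def decrease_in_performance(rates: Dict[str, float],
                            baseline: str = "Silence") -> Dict[str, float]:
    """ASSUMPTION: DP = error rate under sound minus error rate in silence."""
    base = rates[baseline]
    return {stimulus: rate - base
            for stimulus, rate in rates.items() if stimulus != baseline}


def has_only_negative_dp(dp: Dict[str, float]) -> bool:
    """True if the subject performed better than in silence under every sound."""
    return all(value < 0 for value in dp.values())


# Hypothetical data of one subject, one trial per condition for brevity.
target = [3, 1, 4, 9, 5, 2, 6, 8, 7]
subject_trials = {
    "Silence": [(target, target)],                       # perfect recall
    "SwB": [(target, [3, 1, 9, 4, 5, 2, 6, 8, 7])],      # two digits swapped
}
rates = {stimulus: error_rate(t) for stimulus, t in subject_trials.items()}
dp = decrease_in_performance(rates)
print(dp, "exclude subject:", has_only_negative_dp(dp))
```

Because the DP is computed per subject against that subject's own silence condition, individual differences in overall memory span cancel out before the conditions are compared.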
A repeated measures ANOVA for the within-subject design is used to identify significant differences in DP between the stimuli: cumulative, F(5, 100) = 6.3, p < 0.0001, η²_g = 0.127, and acute, F(5, 100) = 16.8, p < 0.0001, η²_g = 0.316. However, a one-way ANOVA does not show any differences between the DP in the two test designs for each stimulus. Thus, the entire data set is used to determine differences in DP between the stimuli. The DP differs statistically between the stimuli (repeated measures ANOVA: F(5, 205) = 19.5, p < 0.0001, η²_g = 0.191). Post-hoc analyses with a Benjamini/Hochberg adjustment [13] reveal that all speech conditions differ statistically significantly from the silence and the pink noise condition. In contrast, neither the speech conditions among themselves nor the silence and pink noise conditions differ statistically from each other (see Figure 3).

Furthermore, a significant difference in the error rate is found between the first and last twelve trials in the entire data set, regardless of the stimulus heard (difference between the mean error rates of the twelve trials: 0.2037, paired t-test: p = 0.044; see Figure 4). Also, when the data are divided into noise-sensitive and non-noise-sensitive subjects according to the median LEF-K value of 14.5 [9] (see Section 3.2), there is no indication of a statistical difference between the DP of the two groups (Wilcoxon test: W = 7118, p = 0.155).

Figure 4: Representation of the entire data for error rate per trial no.
(n = 42) with identification of the significant difference between the first and last twelve trials, paired t-test with p-value: * p = 0.044.

3.2. Questionnaire

The annoyance ratings based on the ISO/TS 15666 questionnaire differ statistically between the two test designs (Wilcoxon test: W = 6310.5, p = 0.0043, r = 0.44). A detailed statistical examination reveals significant differences between silence and the other conditions as well as between pink noise and the speech conditions for both test designs (see Figure 5). The LEF-K results divide the subjects into two groups, separated by the median value of 14.5 (max: 26, min: 6). No statistically significant effects of age or gender on noise sensitivity are found for the subjects involved in this study. Nevertheless, the group of noise-sensitive subjects tends to be older (mean: 42.5 vs. 37.6 years, t = 1.42, p = 0.164). Furthermore, noise sensitivity does not significantly affect the annoyance ratings.

Figure 5: Annoyance rating based on ISO/TS 15666 split into the different test designs (acute, cumulative), each n = 21. Indication of significant differences, paired Wilcoxon test with Benjamini/Hochberg adjustment; adjusted p-value: * p < 0.05, ** p ≤ 0.01 and *** p < 0.001. Significant difference between silence and other conditions as well as between pink noise and the speech conditions regardless of test design.

In the closing interview, the participants report on different aspects:
- the native language of the participants was used to memorize the order of digits,
- the first trials were used to find a solution strategy, which was applied until the end of the experiment,
- the reports on the most disturbing situation were ambiguous: some participants perceived the reverberant situation as most disturbing, others the original speech situation,
- in the acute test design, the change of stimulus from trial to trial led to irritation; some participants had the impression that they performed worse in the silent condition if it was preceded by another sound condition,
- the serial recall test was described as an unrealistic work assignment and as frustrating.

4. DISCUSSION

The results confirm the already known decrease in performance caused by irrelevant sound. However, the varying fluctuation strength (F) does not seem to have the expected influence on the DP. This contradicts the findings of Schlittmeier et al. [4]. A possible explanation could be the influence of the reverberant room on the speech signal and the resulting distortion. Similar findings on masked speech and the performance of the F-model are reported by Liebl et al. [5].

A training effect is visible in the data. On the one hand, possibly too many conditions (six) are used in this experiment: Schlittmeier recommends five conditions for a student sample and fewer for an adult sample [10]. On the other hand, the subjects could habituate to the very similar stimuli presented (all speech conditions are based on one original stimulus). Such habituation is possible according to Banbury and Berry [14].

Grouping the subjects by their noise sensitivity reveals no statistically significant differences, neither for DP nor for annoyance ratings. Likewise, the working hypothesis could not be confirmed: no statistically significant differences in DP or annoyance between speech conditions with and without reverberation can be verified. Still, there is a decrease in the mean DP in one speech condition (SwB to SwB+: from 10.6 % to 8.9 %), but an increase in the other (FS to FS+: from 8.8 % to 12 %).
Even though this difference is not significant, it also cannot be explained by the acoustic parameters, because they change in the same direction in the reverberant condition (see Table 1). A closer look is needed to avoid dependencies of the DP on the chosen speech stimulus in future psychoacoustic and room acoustic studies.

Annoyance ratings differ between silence, pink noise and the speech conditions. However, there is no difference in ratings among the speech conditions, regardless of the room acoustic situation. A five-point scale possibly does not provide sufficient resolution. Interestingly, the mean error rates of the silence and the pink noise condition do not differ statistically. However, the annoyance ratings of those conditions do differ. This is consistent, for example, with the experiment shown in [5].

The influence of varying room acoustic conditions on cognitive performance and perception (e.g. annoyance and disturbance) must be better described by future studies. The selection of stimuli for the irrelevant speech and also for the relevant speech sounds can influence the results. Therefore, more findings are needed in this field. Due to the frustration potential of the serial recall test, the participants state that participating again in another study with the same task is unlikely. Hence, it is important to use realistic, everyday work tasks in future experiments.

ACKNOWLEDGEMENTS

A sincere thank you to all colleagues who participated in and supported this study.

REFERENCES

[1] Industrieverband Büro und Arbeitswelt. IBA Studie 2019/2020 - Die Entwicklung der Büroarbeit, 2020.
[2] W Ellermeier and K Zimmer. The psychoacoustics of the irrelevant sound effect. Acoust Sci & Tech, 35(1):10–16, 2014.
[3] SJ Schlittmeier and A Liebl. The effects of intelligible irrelevant background speech in offices – cognitive disturbance, annoyance, and solutions. Facilities, 33(1/2):61–75, 2015.
[4] SJ Schlittmeier, T Weissgerber, S Kerber, H Fastl, and J Hellbrück. Algorithmic modeling of the irrelevant sound effect (ISE) by the hearing sensation fluctuation strength. Atten Percept Psychophys, 74(1):194–203, 2012.
[5] A Liebl, A Assfalg, and SJ Schlittmeier. The effects of speech intelligibility and temporal–spectral variability on performance and annoyance ratings. Appl Acoust, 110:170–175, 2016.
[6] A Haapakangas, V Hongisto, and A Liebl. The relation between the intelligibility of irrelevant speech and cognitive performance – a revised model based on laboratory studies. Indoor Air, pages 1–17, 2020.
[7] J Reinten, PE Braat-Eggen, M Hornikx, HSM Kort, and A Kohlrausch. The indoor sound environment and human task performance: A literature review on the role of room acoustics. Building and Environment, 123:315–332, 2017.
[8] ISO. Acoustics – assessment of noise annoyance by means of social and socio-acoustic surveys. Standard ISO/TS 15666:2003, International Organization for Standardization, 2003.
[9] K Zimmer and W Ellermeier. Short form of a noise-sensitivity questionnaire [Ein Kurzfragebogen zur Erfassung der Lärmempfindlichkeit]. Umweltpsychologie, 2(2):54–63, 1998.
[10] S Schlittmeier. Detection of noise effects on short-term memory capacity [Erfassung von Lärmeffekten auf die Kurzzeitgedächtniskapazität]. baua: Fokus. 1. Fachgespräch Extra-aurale Wirkungen von Lärm bei der Arbeit, pages 6–8, 2018.
[11] D Leckschat and C Epe. Recordings of speakers for use in virtual acoustics [Aufnahmen von Sprecherinnen und Sprechern zur Verwendung in der virtuellen Akustik, Version 1.2]. Zenodo, 2020.
[12] DGUV. Office acoustics [Akustik im Büro]. DGUV Information 215-443, German Social Accident Insurance, 2021.
[13] Y Benjamini and Y Hochberg. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Statist Soc B, 57(1):289–300, 1995.
[14] S Banbury and DC Berry.
Habituation and dishabituation to speech and office noise. Journal of Experimental Psychology: Applied, 3(3):181–195, 1997.