
Generation and Analysis of Artificial Warning Sounds for Electric Scooters

Antonio J. Torija 1, Andrew S. Elliott 2, Lara Harris 3, Zuzanna Podwinska 4, Connor Welham 5
Acoustics Research Centre, University of Salford
The Crescent, Manchester, M5 4WT

ABSTRACT
Micro-mobility transportation, including electric scooters and e-bikes, could bring substantial benefits: easing road congestion, providing an eco-friendly transportation system, offering low-cost personal transportation, and increasing accessibility. Electric scooters are also very quiet, producing only minimal rolling noise and some high-frequency emission from their electric motors. This quietness, however, has led to safety concerns being raised by accessibility groups such as the Royal National Institute of Blind People (RNIB). For this reason, we have teamed up with the RNIB to explore well-designed Acoustic Vehicle Alerting Systems (AVAS) as a potential solution to ensure vehicle detection and avoid potential conflicts with pedestrians. This paper presents the key findings of a research project carried out by the University of Salford's Acoustics Research Centre, in collaboration with Dott Scooters and the RNIB. The goal of this project was to develop a stand-alone awareness sound system for electric scooters, with the main constraint of ensuring an appropriate balance between vehicle awareness and noise annoyance. Based on preliminary results, this project concluded that a significant benefit, in terms of vehicle noticeability, is observed with the addition of an awareness sound. Funding has been secured to carry out further research to optimise the AVAS for complex urban environments.

1. INTRODUCTION

Electric scooters (or e-scooters) are becoming increasingly popular due to the substantial benefits they bring, such as low-cost and time-efficient personal transportation and reduced carbon dioxide and air pollutant emissions. Regarding their acoustic characteristics, the wider adoption of e-scooters can lead to lower noise emissions compared to internal combustion engine (ICE) vehicles. However, the quietness of e-scooters can lead to safety concerns due to the vehicles not being detected by pedestrians.

1 A.J.TorijaMartinez@salford.ac.uk
2 A.S.Elliott@salford.ac.uk
3 L.E.Harris@salford.ac.uk
4 Z.M.Podwinska@salford.ac.uk
5 C.J.Welham@edu.salford.ac.uk


This is particularly important for blind and low vision people, and several associations across Europe are advocating for the addition of mandatory warning sounds in all electric vehicles [1]. Sekine et al. [2] investigated the detectability of electric motorbikes (EM) and internal combustion engine motorbikes (ICEM) operating at 10 km/h and 20 km/h (with the EM 15 dB quieter than the ICEM). Based on the results of a subjective experiment, the authors found average detection distances of 57.9 m (ICEM) and 11.7 m (EM) at 20 km/h, and 33.6 m (ICEM) and 5.4 m (EM) at 10 km/h. Research studies have shown that the addition of artificially generated sounds can ensure that electric vehicles are detected at a significantly greater distance than without added sound [3].

Regulations for the audibility of hybrid electric vehicles (HEVs) and electric vehicles (EVs) already exist, for instance FMVSS No. 141 in the United States and UNECE R138.01 in the European Union. These regulations describe the minimum requirements for warning sound signals, including the speed range, minimum third-octave levels for non-adjacent bands and frequency range, pitch shifting, and operating conditions (e.g., alert sounds during reverse operation or while the vehicle is stationary). However, it is unlikely that these regulations can be directly applied to e-scooters, as their size and operating conditions are significantly different from those of electric cars or buses.

The implementation of psychoacoustic methods in the design of warning signals can bring important benefits for vehicle detection. Fiebig [4] suggested that audible amplitude modulations seem to be beneficial for increased detectability and localisability. Yasui and Miura [5] found that amplitude modulated warning sounds can achieve the same detectability as 'static' equivalent sounds even when they are 3 dB quieter. Fiebig [4] also argued that psychoacoustic principles should be considered to find the right balance between detectability and annoyance in the design of warning sounds.

This paper presents the results of an on-going research project in collaboration with Dott and the Royal National Institute of Blind People (RNIB). Specifically, this paper overviews the preliminary results of a feasibility study carried out between May and July 2021, focused on:

• Developing a system (including hardware and software) to generate an awareness sound as a function of the e-scooter's operating conditions (e.g., vehicle speed).
• Carrying out a subjective study to investigate pedestrian awareness of an approaching e-scooter with and without an added warning sound. Three warning sounds were tested: broadband, broadband plus tonal, and broadband plus amplitude modulated tonal.
• Defining the next steps for designing optimal awareness sounds that maximise vehicle noticeability without increasing noise annoyance.

2. GENERATION OF WARNING SOUNDS

The awareness sound generated by the scooter should first and foremost be noticeable for reasons of pedestrian safety, but it should ideally also be pleasant, or at the very least inoffensive, whilst adding the minimum amount of unnecessary environmental noise. Note that environmental noise is regarded as a significant health concern; see for example reports from the World Health Organisation [6]. In this study we have chosen, through discussions with the e-scooter manufacturer (Dott) and the RNIB, to investigate the noticeability and sound quality of broadband, tonal and amplitude modulated sounds.

The amplitude modulated tonal sound was included as it may provide a happy medium between tonal and impulsive sounds.

The awareness sounds are synthesised within a Python environment (version 3.8.10). All sound synthesis involves taking various measurements from the scooter. The measurements currently available are accelerometer data, current measurements, and the rotational speed of the wheels. In the current implementation, only the accelerometer data is used. While creating the sounds and the initial script, the process was not run in real time. Instead, we opted to use an already recorded dataset and run it through the processes described below. This allowed all effects and data handling to be tested thoroughly and ensured the project could be worked on away from the scooter.

The signal chain for the synthesis process is as follows. The accelerometer data is read into Python and run through a 10th order digital low-pass Butterworth filter (from the scipy.signal library) with a critical frequency of 0.5 Hz, to remove unnecessary high-frequency content and potential noise; the signal has some resonances at higher frequencies, so it is important to remove these as early as possible. The only other libraries used are simpleaudio for audio output and numpy for array manipulation. Following this, the signal is run through a half-wave rectification process, which involves summing the input signal with the absolute value of the input and dividing by 2. This increases the upper-harmonic content of the accelerometer data and helps to add clarity back to the signal after filtering. The final stages add delay-based effects to the accelerometer data: a standard delay algorithm, which feeds the input back into the data after a predetermined number of samples, and a flanger effect, which feeds a modulated-delayed signal back into the data. The flanger primarily gives the signal movement, as it can sound very static without any treatment. The data is then normalised for audibility and output using the simpleaudio library. All of the above is achieved offline, using an array of previously recorded data. See Figure 1 for the overall flow of signal generation.

Figure 1: Offline signal generation using accelerometer data; the specifics of each block are explained in the text.
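To make the processing chain concrete, the sketch below reproduces the steps described above (low-pass filtering, half-wave rectification, delay, flanger, normalisation and playback) offline on a recorded trace. It is a minimal illustration rather than the project's actual code: the file name, sample rate, delay time and flanger settings are all assumptions.

```python
# Minimal offline sketch of the synthesis chain in Figure 1, assuming a
# pre-recorded accelerometer trace. File name, sample rate, delay time and
# flanger settings are illustrative assumptions, not the project's values.
import numpy as np
import simpleaudio as sa
from scipy.signal import butter, sosfilt

FS = 44100  # assumed playback sample rate

def lowpass(x, fc=0.5, order=10, fs=FS):
    """10th-order Butterworth low-pass with a 0.5 Hz critical frequency."""
    sos = butter(order, fc, btype="low", fs=fs, output="sos")
    return sosfilt(sos, x)

def half_wave_rectify(x):
    """Sum the signal with its absolute value and divide by 2."""
    return (x + np.abs(x)) / 2.0

def delay(x, delay_samples=2205, gain=0.5):
    """Feed the input back into the signal after a fixed number of samples."""
    y = np.copy(x)
    y[delay_samples:] += gain * x[:-delay_samples]
    return y

def flanger(x, max_delay_ms=3.0, rate_hz=0.25, depth=0.7, fs=FS):
    """Feed a sinusoidally modulated-delayed copy back into the signal."""
    n = np.arange(len(x))
    max_d = int(fs * max_delay_ms / 1000)
    d = ((np.sin(2 * np.pi * rate_hz * n / fs) + 1) / 2 * max_d).astype(int)
    y = np.copy(x)
    idx = n - d
    ok = idx >= 0
    y[ok] += depth * x[idx[ok]]
    return y

accel = np.load("accelerometer_trace.npy")   # hypothetical recorded dataset
out = flanger(delay(half_wave_rectify(lowpass(accel))))
out = out / np.max(np.abs(out))              # normalise for audibility
pcm = (out * 32767).astype(np.int16)         # 16-bit PCM
sa.play_buffer(pcm, 1, 2, FS).wait_done()    # mono playback via simpleaudio
```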

For the subjective listening test, sound was synthesised using a Digital Audio Workstation (DAW). These initial stages of sound synthesis were used to better understand what types of sounds would be preferable for a warning system, without having to code each element first. The sounds were designed with the limitations of Python in mind; therefore, only simple waveforms and basic delay-based effects were used. This also allowed testing of signals that may be created during the next steps of the project, such as standard waveforms which increase in pitch as the scooter increases in speed. The sounds were separated into three distinct categories: broadband, broadband with tone, and broadband with modulated tone. The broadband signals typically had a low-pass or band-pass filter with a cut-off frequency which increased with vehicle speed. The tones were typically simple waveforms such as sine, square or sawtooth, chosen to keep the Python implementation as fast as possible. Finally, amplitude modulation was achieved using the tachometer data; as this is a direct reading of the rotational speed of the wheel, the frequency of the amplitude modulation increases with the scooter's speed. Figure 2 displays the sound generation process for the warning sounds used in the subjective experiment.

Figure 2: Subjective listening test signal generation. All samples used in testing were created in Ableton 11, to allow a greater spread of distinct sounds in a quicker time frame.
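A minimal sketch of the three sound categories is given below. The speed-to-cutoff mapping, tone frequency and wheel rotation rate are illustrative assumptions; the test stimuli themselves were produced in Ableton 11 as noted in the caption.

```python
# Illustrative sketch of the three warning-sound categories. The
# speed-to-cutoff mapping, tone frequency and wheel rotation rate are
# assumptions for demonstration, not the study's actual settings.
import numpy as np
from scipy.signal import butter, sosfilt

FS = 44100

def broadband(speed_kmh, dur=1.0, fs=FS):
    """Noise low-passed with a cut-off that rises with vehicle speed."""
    noise = np.random.randn(int(fs * dur))
    fc = 500.0 + 200.0 * speed_kmh          # assumed mapping (Hz)
    sos = butter(4, fc, btype="low", fs=fs, output="sos")
    return sosfilt(sos, noise)

def tone(freq, dur=1.0, fs=FS):
    """Simple sine tone (square or sawtooth waves could be used instead)."""
    t = np.arange(int(fs * dur)) / fs
    return np.sin(2 * np.pi * freq * t)

def modulated_tone(freq, wheel_rps, dur=1.0, fs=FS):
    """Tone amplitude-modulated at the wheel rotation rate (tachometer)."""
    t = np.arange(int(fs * dur)) / fs
    mod = 0.5 * (1.0 + np.sin(2 * np.pi * wheel_rps * t))
    return mod * tone(freq, dur, fs)

speed_kmh, wheel_rps = 24.0, 8.0            # assumed operating point
bb = broadband(speed_kmh)                                # broadband
bb_tonal = bb + 0.3 * tone(800.0)                        # broadband + tonal
bb_mod = bb + 0.3 * modulated_tone(800.0, wheel_rps)     # broadband + modulated tonal
```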

3. SUBJECTIVE EXPERIMENT

An immersive listening experiment was conducted to investigate the effectiveness of the awareness sounds. It used 360-degree videos presented through a VR headset and spatial audio reproduced over headphones.

The sound and video stimuli were recorded in two separate locations: MediaCityUK and Peel Park in Salford (see Figure 3). The first was an open urban area, including pedestrians, cyclists, hospitality noise and music. The second was a quieter park area, with a play area, less pedestrian and cyclist activity, foliage noise and distant road traffic noise. These two locations were selected to test the awareness sounds under different levels of activity (i.e., distractions) and background noise. The 360-degree video stimuli were recorded using an Insta360 Pro 2 camera. The first-order Ambisonic audio stimuli were recorded using a Soundfield ST450 microphone with a Zoom F8n field recorder. The Ambisonic microphone was placed directly beneath the 360 camera, so that the audio matched the video as closely as possible without the microphone being visible to the camera. A series of scooter pass-bys were recorded, with the scooter operating at its constant maximum speed (i.e., 15 mph or 24 km/h) and approaching the camera (i.e., the simulated pedestrian) from behind at different angles. Stimuli without e-scooters were also recorded (i.e., no vehicles passing by, or other vehicles passing by, such as a bicycle). In the subjective experiment, the recorded sound and video stimuli were used to mimic scooter pass-by events both with and without the developed awareness sounds (i.e., with only the 'baseline' sound from the real recordings vs. baseline plus added awareness sound).

Figure 3: Recording locations: Peel Park (left) and MediaCityUK (right).

The audio-visual scenes were presented to the participants via an Oculus Quest 2 VR headset, using a Focusrite Scarlett 2i4 audio interface and Beyerdynamic DT 250 headphones. The VR visual scenes and the recorded Ambisonic audio were synchronised, and 20-second-long clips were selected, some of which included an e-scooter pass-by. Audio and video were combined in Unity, and the Ambisonic recordings were decoded using the Steam Audio plugin, which provides binaural spatialisation with head-related transfer functions using head tracking from the VR headset. In scenes with an added e-scooter sound, an audio object was created in Unity which followed the movement path of the e-scooter in the recording; this audio object was also spatialised with Steam Audio. The different awareness sounds were carefully designed not to significantly increase the sound emitted by the e-scooter, and therefore not to contribute to an overall increase in noise annoyance. Table 1 shows the LAeq,20s at the participant position for the e-scooter stimuli with and without added awareness sounds.

Note that the LAeq,20s values for the Peel Park scenes are shown, as this was the quietest scenario tested. Table 1 therefore supports the claim that effective awareness sounds can be generated without (significantly) increasing the overall sound pressure level.

Table 1: Sound pressure level (LAeq,20s at the participant position) of the different stimuli tested in the experiment, for the Peel Park scenes.

Stimuli                                              LAeq,20s (dBA)
e-scooter without added sound                        46.13
e-scooter + broadband sound                          48.09
e-scooter + broadband plus tonal sound               48.09
e-scooter + broadband plus modulated tonal sound     48.47
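As a rough consistency check (assuming the e-scooter noise and the awareness sound add incoherently), the level of the broadband awareness sound alone can be estimated from Table 1:

$$ L_{\mathrm{awareness}} = 10\log_{10}\!\left(10^{48.09/10} - 10^{46.13/10}\right) \approx 43.7\ \mathrm{dBA} $$

That is, the added sound alone sits a few decibels below the e-scooter's own rolling noise, consistent with the small increase in overall level.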

Each experimental session consisted of two parts. The first part studied the noticeability of the e-scooter awareness sounds: participants sat in a room wearing the VR headset and headphones and responded using the Oculus Quest 2 controller. In each experimental trial, they were shown one of the 360° video scenes together with a short text excerpt taken from the DeepMind Q&A Dataset (see Figure 4). They were asked to read the text and told they would be asked a question about it afterwards. At the same time, they were asked to press a button on the controller as soon as they detected a moving hazard, defined as anything that could potentially cause harm to the person if they were really in the situation displayed in the video. When the video finished, a question about the text was displayed in front of the participant, with four possible answers, and the participant chose the answer they thought was correct using the VR controller. This task was included to focus participants' attention on something other than potential hazards in the scene; the intention was to create a distraction and increase cognitive load, thereby increasing the need for a more effective alert. A short practice session was provided at the beginning to familiarise participants with the task, and to ensure that they were able to read the text clearly and that the VR headset was comfortable and secure.

Figure 4: Example video scene in the noticeability experiment.

Each session consisted of 20 trials. The independent variables studied in the experiment were environment (Peel Park and MediaCityUK) and warning sound (broadband, broadband plus tonal, and broadband plus modulated tonal). All participants were shown three video scenes from each environment, once with an awareness sound and once without; which video scene was matched with which awareness sound was randomised for each participant. Additionally, participants were shown four video scenes from each environment which did not contain an e-scooter – some with a bicycle pass-by, and some without any moving hazard. These were included to make the task less predictable. The order of presentation of the trials was randomised for each participant, and the same video scene was never presented twice in a row.

The second part of the experiment studied participants' preference for the three awareness sounds. Before starting this task, participants were debriefed about the purpose of the experiment and told that we were studying e-scooter sounds. They were shown a user interface written in MATLAB with three buttons, which allowed them to listen to each of the three sounds as many times as they wished. They were then asked to rank the sounds from most preferable to least preferable, in terms of which sound they would most like to hear as a pedestrian, or which was the most pleasant. Finally, they were asked to leave an optional short comment justifying their choice.

4. RESULTS

4.1. Participant Sample

A total of 15 people completed both parts of the experiment. All were right-handed, so all responded to the reaction-time task with their dominant hand. Most people were aged between 18 and 44, with the sample quite evenly distributed by age and gender. Most participants (9 of the 15) said they were not native British English speakers, so the text-based distractor task may have been more challenging for them.

Participants were required to self-report normal hearing to take part. People with low vision were not explicitly targeted for recruitment. Five people reported a visual impairment, but these were minor (e.g., short-sightedness), and participants were allowed to wear glasses during the experiment if they needed them.

4.2. Noticeability results

For the noticeability analysis, data were excluded from anyone clicking in fewer than 50% of trials, as this meant they were not performing the task as expected. Three participants were excluded on this basis; everyone else responded (clicked) in 100% of trials containing e-scooters. Response times were recorded for each scene containing an e-scooter pass-by, calculated from the beginning of the scene until the button press.

The e-scooter pass-by occurred at a different time in each video scene, so the data was analysed by comparing responses to the same scene with and without an awareness sound (baseline). Figure 5 shows the benefit of introducing the three warning sounds, calculated as the difference in response time between the same video scene with and without an awareness sound. Positive values mean that response time was shorter with the awareness sound; negative values mean that a person’s reaction was slower with the added sound. It appeared that sound type and environment had an influence on reaction times, but further analysis was required.
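For illustration, a hypothetical sketch of the exclusion rule and the per-scene benefit computation might look as follows (the file and column names are assumptions, not the study's actual pipeline):

```python
# Hypothetical trial-level data: participant, scene, sound, clicked (0/1), rt (s)
import pandas as pd

df = pd.read_csv("trials.csv")

# Exclude anyone who clicked in fewer than 50% of trials.
click_rate = df.groupby("participant")["clicked"].mean()
kept = click_rate[click_rate >= 0.5].index
df = df[df["participant"].isin(kept)]

# Benefit = response time without an added sound minus response time with one,
# compared within the same video scene (positive = faster with the sound).
wide = df.pivot_table(index=["participant", "scene"], columns="sound", values="rt")
for s in ["broadband", "broadband+tones", "broadband+modulated_tones"]:
    wide[f"benefit_{s}"] = wide["no_sound"] - wide[s]
```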

Figure 5: Boxplots of the benefit of introducing an awareness sound, calculated for each participant. The horizontal lines show the median, and the shaded boxes show the 25th and 75th percentiles of the data.

A mixed-effects linear model was fitted to the data using the R package lme4, with Scene as a random effect; this accounted for the different 'baseline' response times due to the varying timestamps at which the e-scooter pass-by occurred in the video clips. An ANOVA on the contribution of each variable showed that neither the type of environment nor the interaction between Sound and Environment was statistically significant. However, Sound was a significant predictor of response time (p = 0.03).
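The analysis was performed in R with lme4; for readers working in Python, an approximately equivalent model can be sketched with statsmodels (a stand-in, not the study's code; the file and column names are assumptions):

```python
# Stand-in for the R (lme4) analysis: a linear mixed-effects model via
# statsmodels, with Scene as a random intercept. With treatment coding
# against 'no_sound', the fixed-effect coefficients for each sound level
# play the role of the Table 2 contrasts, though the emmeans adjustments
# are not reproduced here.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("response_times.csv")   # columns: rt, sound, environment, scene

model = smf.mixedlm("rt ~ C(sound, Treatment('no_sound')) * C(environment)",
                    df, groups=df["scene"])
result = model.fit()
print(result.summary())                  # cf. Sound as a significant predictor
```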

To find out which awareness sounds showed a significant benefit in response times, contrasts were calculated using the ‘emmeans’ R package. The results in Table 2 show that only the awareness sound with modulated tones produced a statistically significant difference from having no awareness sound.

Table 2: Results of the contrast analysis showing the contribution of each tested sound to the response time. The estimate shows the predicted change in reaction time relative to having no added awareness sound (negative values indicate faster responses). Only broadband plus modulated tones was statistically significant, with approximately half a second reduction in response time.

contrast                                  estimate   SE     df    t.ratio   p.value
broadband – no_sound                      -0.09      0.16   124   -0.53     0.882
broadband+tones – no_sound                -0.18      0.17   124   -1.07     0.570
broadband+modulated_tones – no_sound      -0.48      0.16   124   -2.91     0.012

4.3. Preference results

Data from all 15 participants were analysed for the preference task. In this task, listeners were asked to rank the three awareness sounds from the perspective of a pedestrian rather than an e-scooter rider: 'The one you would least like to hear walking down the street is least preferable. The one you would be most happy to hear is most preferable.' The analysis compared the distribution of ranks given to the three sounds, where rank 1 is the most preferred and rank 3 the least preferred. The results in Figure 6 show that the broadband sound was ranked most often as first choice (most preferred). The sound that showed a response-time benefit, broadband plus modulated tones, was most often ranked second for listener preference.

Figure 6: Distribution of preferences for the three awareness sounds. Broadband + modulated tones was most often selected as rank 2 (the middle option) for listener preference.

In the free input field, participants characterised the broadband sound as “more relaxing”, “more soothing”, “least annoying – continuous and not tonal”, “almost like the sea” and said it “sounds more like traffic noise and a stream rushing”.

The broadband plus tones sound was described as a “buzzing sound that is very irritating”, “deafening as it emitted a continuous loud hum”, “most annoying as it was strongly tonal”, “too annoying”. Two participants expressed that they would not want to listen to it regularly: “I could not listen to it for long periods of time as it was not pleasant to listen to” and “very grating on the ears and I would definitely not want to listen to that every day!”.

The sound that demonstrated a benefit in reaction time when used as an alert, broadband plus modulated tones, had the most mixed responses. Participants described it as "reasonably ok", "a bit annoying", "in the middle - it was annoying because of the variation" and wrote: "I'm still not very comfortable with [it], but that seems was better than the rest of choices". One participant thought it was "really distinctive and very annoying". Participants also commented that it "sounded futuristic", "sounds like something from Tron which is ok but not entirely what I'd want to hear in the environment" and "a bit scary as it sounds like a beam but it is the one I would choose".

5. CONCLUSIONS

This paper presents the results of a feasibility study carried out for the development of a system for the generation of acoustic awareness signals for electric scooters. The system, consisting of both hardware and software, generates an awareness sound as a function of the e-scooter's operating conditions (i.e., vehicle speed). The awareness sound generation has been developed on a Raspberry Pi computer, within a Python environment. After careful consideration of different options for the sound generation, the system presented in this paper amplifies and radiates sound according to a voltage input at line level.

A laboratory study was carried out to gauge pedestrian awareness of an approaching e-scooter with and without added awareness sounds. The broadband plus modulated tones sound was found to decrease the detection time of the approaching e-scooter by 0.48 seconds (compared to the e-scooter without any added sound). With the e-scooter moving at 24 km/h (15 mph), this translates to noticing it approximately 3.2 metres further away than when there is no added sound.
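As a check of this figure, using the speed and response-time values above:

$$ v = \frac{24\ \mathrm{km/h}}{3.6} \approx 6.67\ \mathrm{m/s}, \qquad d = v\,\Delta t \approx 6.67\ \mathrm{m/s} \times 0.48\ \mathrm{s} \approx 3.2\ \mathrm{m} $$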

The broadband plus modulated tones sound performed best in terms of noticeability and was also ranked second in terms of preference. Based on the preliminary results of this feasibility study, amplitude modulation seems a very efficient acoustic feature for increasing vehicle noticeability. The results also indicate that a good compromise between noticeability and annoyance can be achieved with a well-designed warning sound. Further research is needed to design awareness sounds with an optimal balance between noticeability and annoyance.

6. ACKNOWLEDGEMENTS

The authors would like to acknowledge the funding provided by the University of Salford's Higher Education Innovation Fund and emTransit B.V. (Dott). The authors would also like to acknowledge the contributions of Rory Nicholls and Bernard Steer to the recording of the audio-visual stimuli.

7. REFERENCES

1. Pierce, B. A report on the quiet car emergency. The media weigh in. Braille Monitor, 50(7) (2007).
2. Sekine, M., Sakamoto, I., Houzu, H., Nishi, T., and Morita, K. Necessity of acoustic vehicle alerting system for electric motorcycle to ensure pedestrian safety. In INTER-NOISE and NOISE-CON Congress and Conference Proceedings, 255(5), 2187-2198 (2017).
3. Kim, D. S., Emerson, R. W., Naghshineh, K., Pliskow, J., and Myers, K. Impact of adding artificially generated alert sound to hybrid electric vehicles on their detectability by pedestrians who are blind. Journal of Rehabilitation Research and Development, 49(3), 381-393 (2012).
4. Fiebig, A. Electric vehicles get alert signals to be heard by pedestrians: Benefits and drawbacks. Acoustics Today, 16(4) (2020). https://doi.org/10.1121/AT.2020.16.4.20.
5. Yasui, N. and Miura, M. Use of fluctuation on warning sound for approaching electric vehicle. Acoustical Science and Technology, 41(2), 513-516 (2020).
6. World Health Organisation. Environmental noise guidelines for the European region (2018).