
A Method for Separating Knocking Sounds from Engine Radiation Noise by Deep Learning

Hikaru Watabe 1, Taro Kasahara 2
Ono Sokki Co., Ltd.
3-9-3 Shin-Yokohama, Kohoku-ku, Yokohama, Japan

ABSTRACT

Knocking is abnormal combustion in a gasoline engine that generates a metallic noise and can damage the engine. In engine calibration work, an operator detects knocking by listening to the engine radiation noise in order to prevent engine damage, and there is a need to automate this task. We developed a deep learning model that separates the knocking sound from engine radiation noise measured by a microphone. The model learns a time-frequency mask from paired data of engine radiation noise and cylinder pressure, and the mask enables the knocking sound to be separated from the engine radiation noise. By training on data from various rotation speeds, the proposed model can separate the knocking sound at an engine speed that was not included in the training data.

1. INTRODUCTION

In recent years, engine calibration, the process of arriving at optimal settings, has become automated in the development of internal combustion engines in order to shorten development time and reduce workload. In ignition timing calibration, one of the gasoline engine calibration processes, it is important to prevent heavy knocking from occurring. Knocking is abnormal combustion in a gasoline engine and generates a metallic impact sound. While heavy knocking damages the engine and must be suppressed, light knocking must be tolerated in order to obtain the best performance from the engine, which requires careful adjustment of the ignition timing. The severity of knocking is evaluated as a knocking level, which reflects both the loudness of the knocking sound and the number of times it occurs. The knocking level is generally evaluated by measuring the in-cylinder pressure signal, by the hearing of a trained person (expert), or by a combination of both. While an expert's hearing can detect faint differences and provide flexible evaluations, this approach suffers from individual differences, an increased calibration workload, and a shortage of expert personnel. For these reasons, it is desirable to develop a system that can quantitatively evaluate and automatically judge the knocking level on behalf of experts.

We have previously studied a knocking sound detection method based on outlier detection (1)(2). This paper describes a knocking detection method that uses a deep neural network (DNN) to separate the knocking sound from the engine radiation noise (3)(4).

1 watabe@onosokki.co.jp

2 kasahart@onosokki.co.jp

inter.noise, 21-24 August, Scottish Event Campus, Glasgow

2. KNOCKING SOUND SEPARATION FROM ENGINE RADIATION NOISE BY DEEP LEARNING

Equation 1 is an example of sound source separation using a DNN when the source signal is known.

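$$\mathbf{Y} \approx \hat{\mathbf{Y}} = \mathbf{M} \odot \mathbf{X}, \qquad 0 \le \mathbf{M} \le 1 \tag{1}$$

$\mathbf{Y}$: Clean sound spectrogram
$\hat{\mathbf{Y}}$: Estimated sound spectrogram
$\mathbf{X}$: Observed sound spectrogram
$\mathbf{M}$: Time-frequency mask
$\odot$: Hadamard product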

$\mathbf{X}$ is the amplitude spectrogram of the sound observed by the microphone, and $\mathbf{Y}$ and $\hat{\mathbf{Y}}$ are the amplitude spectrogram of the clean source signal, without superimposed noise, and its estimate. $\mathbf{M}$ is the time-frequency mask that separates the source signal from the observed sound; it is a real matrix with elements ranging from 0 to 1. The more a time-frequency component of $\mathbf{X}$ is attributable to $\mathbf{Y}$, the closer the corresponding element of $\mathbf{M}$ is to 1. The Hadamard product of $\mathbf{X}$ and $\mathbf{M}$ yields $\hat{\mathbf{Y}}$. If the amplitude spectrogram of the clean source signal $\mathbf{Y}$ is available in advance, a mixture of the clean source signal and the noise expected in a real environment is used as the DNN training data. With $\mathbf{X}$ taken as the amplitude spectrogram of this mixture, minimizing the error (e.g., $|\mathbf{Y} - \hat{\mathbf{Y}}|$) between the amplitude spectrogram of the clean source signal and its estimate allows the DNN to generate an $\mathbf{M}$ that separates the source signal from the noise.
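The masking operation in Equation 1 can be illustrated with a minimal NumPy sketch. The toy spectrogram sizes, the assumption that amplitudes add, and the oracle ratio mask used in place of a DNN output are all illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def apply_mask(mask, observed):
    """Hadamard (element-wise) product of a T-F mask and an amplitude spectrogram."""
    mask = np.clip(mask, 0.0, 1.0)  # Equation 1 requires 0 <= M <= 1
    return mask * observed

def mean_abs_error(clean, estimate):
    """Example training error |Y - Y_hat|, averaged over all T-F bins."""
    return float(np.mean(np.abs(clean - estimate)))

rng = np.random.default_rng(0)
clean = rng.random((4, 6))      # toy Y: 4 frequency bins x 6 time frames
noise = rng.random((4, 6))      # toy noise spectrogram
observed = clean + noise        # toy X (assumes amplitudes add, a simplification)

# Oracle mask standing in for the DNN output (hypothetical, for illustration only).
oracle_mask = clean / (observed + 1e-12)
estimate = apply_mask(oracle_mask, observed)
error = mean_abs_error(clean, estimate)
```

With the oracle mask the estimate reproduces the clean spectrogram almost exactly; a trained DNN would only approximate such a mask.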


Since the knocking sound is recorded using a microphone installed near the operating engine, it is observed together with various sounds emitted by the engine (Figure 1(a)).

Figure 1: Recording of cylinder pressure and engine radiation noise

Therefore, a clean knocking sound without noise cannot be observed, and the full amplitude spectrogram of the source signal $\mathbf{Y}$ is unknown. Hence, instead of the clean knocking sound, we considered using the knocking component extracted by high-pass filtering the cylinder pressure (hereinafter referred to as "knocking cylinder pressure"). Figures 1(b) and 1(c) show the amplitude spectrograms of the knocking in-cylinder pressure and the engine radiation noise when knocking occurs.

The DNN (U-Net (5), Figure 2) is trained to separate the knocking sound by minimizing the loss function in Equation 2.

[Figure 1 panels: (a) recording of cylinder pressure and engine radiation noise, showing the in-cylinder pressure sensor, microphone, gasoline engine, and data logger; (b) knocking in-cylinder pressure, time in ms; (c) engine radiation noise, time in ms]

Figure 2: Structure of DNN (U-Net)

$$L = -\frac{1}{N}\sum_{n=1}^{N} \text{SI-SNR}\left(k_n, \hat{k}_n\right) + \frac{1}{F}\sum_{f=1}^{F}\left(\mathrm{SNR}_{ry}(f) - \mathrm{SNR}_{\hat{k}y}(f)\right) \tag{2}$$

$$k_n = h * y_n$$

$$\mathrm{SNR}(f) = \frac{\gamma^2(f)}{1 - \gamma^2(f)}$$

SI-SNR: Scale-invariant signal-to-noise ratio
SNR: Signal-to-noise ratio
$k$: Knocking sound estimated by convolution
$h$: Impulse response estimated from engine radiation noise and knocking in-cylinder pressure
$y$: Observed knocking in-cylinder pressure
$\hat{k}$: Separated knocking sound
$r$: Residual noise
$\gamma^2$: Coherence function
$F$, $f$: Frequency index

This loss function consists of two terms: the scale-invariant signal-to-noise ratio (SI-SNR) (6) between the knocking sound estimated by convolution, $k_n$, and the knocking sound separated from the engine radiation noise, $\hat{k}_n$; and the signal-to-noise ratios ($\mathrm{SNR}_{ry}$ and $\mathrm{SNR}_{\hat{k}y}$), introduced to evaluate the degree of sound separation. The time signals of the separated knocking sound and the residual noise are obtained from their respective amplitude spectrograms combined with the phase spectrogram of the engine radiation noise. Figure 3 shows the separation results for engines A and B at 1000, 3000, and 5000 r/min. In this case, the DNNs were trained individually for each engine and each rotational speed. The engine specifications are shown in Table 1.
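As an illustration of the first term of the loss, the following is a minimal NumPy sketch of SI-SNR between a reference signal and an estimate. It follows the commonly used definition with mean removal and projection onto the reference; whether the paper's implementation uses exactly this variant is an assumption, and the test signals are hypothetical.

```python
import numpy as np

def si_snr(reference, estimate, eps=1e-12):
    """Scale-invariant SNR in dB between a reference and an estimated signal."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    # Remove the mean so a constant offset does not affect the result.
    reference = reference - reference.mean()
    estimate = estimate - estimate.mean()
    # Project the estimate onto the reference to obtain the target component.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference
    error = estimate - target
    return 10.0 * np.log10((np.dot(target, target) + eps) / (np.dot(error, error) + eps))

# Hypothetical signals: a sinusoid standing in for k_n and two noisy estimates.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 1000, endpoint=False)
k_ref = np.sin(2.0 * np.pi * 5.0 * t)
noise = rng.normal(0.0, 1.0, t.size)
good_estimate = k_ref + 0.1 * noise
poor_estimate = k_ref + 1.0 * noise

good_db = si_snr(k_ref, good_estimate)
poor_db = si_snr(k_ref, poor_estimate)
```

Because the estimate is projected onto the reference, multiplying the estimate by any nonzero gain leaves the value essentially unchanged, which is what makes the measure scale-invariant.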

[Figure 2 labels: observed sound (amplitude and phase spectrogram); T-F mask; Hadamard product; knocking sound and noise (amplitude spectrogram and waveform)]

Table 1: Engine Specifications

                          Engine A              Engine B
Total displacement        2.4 L                 1.5 L
Cylinder configuration    In-line 4-cylinder    In-line 4-cylinder
Bore                      87 mm                 73 mm
Fuel injection system     Port injection        Direct injection

Figure 3: Knocking in-cylinder pressure, engine radiation noise, separated knocking sound, and residual noise at each rotational speed. (a) Engine A; (b) Engine B. Axis labels: time [ms], frequency [kHz], sound pressure [dB], knocking in-cylinder pressure [dB].

The microphone was placed approximately 150 mm in front of the center of cylinders 2 and 3 on the intake side. The engine radiation noise and the in-cylinder pressure near the ignition timing of one cylinder were used for analysis. Figure 3 shows the knocking sound and the residual noise after knocking sound separation for engines A and B at each rotation speed. At all speeds, the knocking sound, which is scattered around 6 kHz and in the 10 to 20 kHz band of the engine radiation noise, is separated. The coherence functions between the knocking cylinder pressure and the engine radiation noise, the separated knocking sound, and the residual noise are shown in Figure 4. The lower coherence of the residual noise compared with the separated knocking sound indicates that the knocking sound (or part of it) has been separated from the engine radiation noise.
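The coherence function and the SNR defined from it in Equation 2 can be estimated as sketched below. This is a minimal NumPy sketch assuming a Welch-style average over Hann-windowed, 50%-overlapping segments; the segment length and the synthetic signals are hypothetical choices, not values from the paper.

```python
import numpy as np

def coherence(x, y, nperseg=256):
    """Magnitude-squared coherence from averaged auto- and cross-spectra."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    window = np.hanning(nperseg)
    step = nperseg // 2  # 50% overlap between segments
    pxx = 0.0
    pyy = 0.0
    pxy = 0.0 + 0.0j
    for start in range(0, min(len(x), len(y)) - nperseg + 1, step):
        fx = np.fft.rfft(window * x[start:start + nperseg])
        fy = np.fft.rfft(window * y[start:start + nperseg])
        pxx = pxx + np.abs(fx) ** 2
        pyy = pyy + np.abs(fy) ** 2
        pxy = pxy + np.conj(fx) * fy
    return np.abs(pxy) ** 2 / (pxx * pyy + 1e-20)

def snr_from_coherence(gamma2, eps=1e-12):
    """SNR(f) = gamma^2(f) / (1 - gamma^2(f)), clipped to avoid division by zero."""
    gamma2 = np.clip(gamma2, 0.0, 1.0 - eps)
    return gamma2 / (1.0 - gamma2)

# Synthetic stand-ins: a pressure-like signal, a strongly related signal
# (like a well-separated knocking sound), and an unrelated one (like residual noise).
rng = np.random.default_rng(0)
pressure = rng.normal(size=8192)
related = pressure + 0.1 * rng.normal(size=8192)
unrelated = rng.normal(size=8192)

coh_related = coherence(pressure, related)
coh_unrelated = coherence(pressure, unrelated)
snr_related = snr_from_coherence(coh_related)
```

In this toy setting the related signal shows coherence near 1 and the unrelated signal shows coherence near 0, mirroring the contrast between separated knocking sound and residual noise described above.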


Figure 4: Coherence function between knocking cylinder pressure and engine radiation sound, separated knocking sound, and residual noise. (a) Engine A; (b) Engine B. Curves: separated sound, observed sound, residual noise; axes: frequency [kHz], coherence.

3. PROPOSED KNOCKING DETECTION AND EVALUATION METHOD

Equation 3 defines the anomaly score used for knocking detection. Knocking is judged to have occurred when the anomaly score exceeds a preset threshold. Equation 3 is based on the assumption that the expert evaluates the knocking level by comparing the knocking sound with the rest of the engine radiation noise. The numerator of Equation 3 expresses the maximum value of the knocking sound, and the denominator expresses the average value of the residual noise. To reflect the sensitivity of the human ear, A-weighting is applied to the spectrograms in advance.

Figure 5 shows an example of the anomaly score calculated at 1000 r/min for a 4-cylinder engine (Engine A). The rotation speed was kept constant. To adjust the knocking level, we varied the ignition timing of the fourth cylinder only. The knocking level as determined by an expert is shown below the graph. The knocking level is defined as None when there is no knocking, and it increases in the order Trace, Light. As shown in Figure 5, the anomaly score increases as the knocking level increases, so a threshold can be set to detect knocking.

$$\text{S/N anomaly score} = \frac{\max_{t} \sum_{f=1}^{F} k_{f,t}}{\dfrac{1}{T}\sum_{t=1}^{T}\sum_{f=1}^{F} r_{f,t}} \tag{3}$$

$k$: Extracted knocking sound spectrogram
$r$: Residual noise spectrogram
$F$, $f$, $T$, $t$: Frequency and time index
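The anomaly score in Equation 3 can be computed as in the following minimal NumPy sketch. It assumes spectrograms of shape (frequency, time) that have already been A-weighted; the toy values and the threshold of 1.5 are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def sn_anomaly_score(knock_spec, residual_spec, eps=1e-12):
    """Peak frame energy of the knocking sound over mean frame energy of the residual."""
    knock_spec = np.asarray(knock_spec, dtype=float)        # shape (F, T)
    residual_spec = np.asarray(residual_spec, dtype=float)  # shape (F, T)
    peak_knock = knock_spec.sum(axis=0).max()          # max over t of sum over f
    mean_residual = residual_spec.sum(axis=0).mean()   # (1/T) * sum over f and t
    return float(peak_knock / (mean_residual + eps))

# Toy spectrograms: 3 frequency bins x 4 time frames.
residual = np.ones((3, 4))   # each frame sums to 3, so the mean is 3
knock = np.zeros((3, 4))
knock[:, 2] = 2.0            # one knocking frame that sums to 6

score = sn_anomaly_score(knock, residual)   # 6 / 3 = 2
threshold = 1.5                             # hypothetical preset threshold
knock_detected = score > threshold
```

Taking the maximum over frames in the numerator makes the score sensitive to a short knocking burst, while the time average in the denominator normalizes by the background level of the cycle.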


Figure 5: Example of calculation of anomaly score (Engine A)

4. PROPOSAL OF A DNN TRAINING METHOD APPLICABLE TO A WIDE RANGE OF ROTATION SPEEDS THROUGH DATA AUGMENTATION

In the methods described so far (hereinafter referred to as "previous methods"), DNNs specialized for each rotational speed are trained. This chapter describes an improved method (hereinafter referred to as "proposed method") that can handle a wide range of rotation speeds (4). The proposed method trains a DNN with augmented data for multiple rotational speeds so that knocking sounds can be separated even at rotational speeds not included in the training data.

4.1. DNN training method using multiple rotation speed data

The proposed method trains a new DNN using the separation results of the DNNs obtained by the previous method. Figure 6 shows an overview of the proposed method. First, a DNN specialized for each rotational speed is trained using the previous method to separate the knocking sound from the engine radiation noise at each rotational speed. Second, mixed signals are created by adding the separated knocking sounds to engine radiation noise that contains no knocking sound (i.e., the residual noise left after the knocking sound has been separated, or the engine radiation noise of a cycle without knocking). At this stage, data augmentation (expansion and contraction of the sound, increase or decrease of the sound pressure, and addition of noise) is applied to both the separated knocking sounds and the knock-free noise. A new DNN is then trained to generate a time-frequency mask that separates the knocking sound from the augmented input data. The proposed method trains the DNN to minimize the loss function in Equation 4, treating the source signal as known. $\hat{\mathbf{K}}_n$ in the loss function is the knocking sound separated by the previous method, so the DNN can be trained to separate knocking sounds without using clean knocking sounds.
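The augmentation and mixing step can be sketched in NumPy as follows. The paper does not specify how the sound is expanded or contracted, so this sketch assumes resampling along the time axis by linear interpolation; the function names, rates, gains, and noise levels are all hypothetical.

```python
import numpy as np

def time_stretch(signal, rate):
    """Resample along time by linear interpolation; rate > 1 shortens the signal."""
    n_out = max(2, int(round(len(signal) / rate)))
    old_positions = np.arange(len(signal))
    new_positions = np.linspace(0, len(signal) - 1, n_out)
    return np.interp(new_positions, old_positions, signal)

def augment(signal, rate, gain_db, noise_std, rng):
    """Apply stretching, a level change in dB, and additive Gaussian noise."""
    out = time_stretch(signal, rate) * 10.0 ** (gain_db / 20.0)
    return out + rng.normal(0.0, noise_std, size=out.shape)

def mix(knock, background):
    """Add a knocking sound to knock-free noise, truncated to the shorter length."""
    n = min(len(knock), len(background))
    return knock[:n] + background[:n]

rng = np.random.default_rng(0)
knock = np.sin(np.linspace(0.0, 20.0 * np.pi, 400))   # stand-in for a separated knocking sound
background = rng.normal(0.0, 0.1, 480)                # stand-in for knock-free noise

knock_aug = augment(knock, rate=1.25, gain_db=-6.0, noise_std=0.01, rng=rng)
background_aug = augment(background, rate=0.8, gain_db=3.0, noise_std=0.01, rng=rng)
mixture = mix(knock_aug, background_aug)   # training input; knock_aug is the target
```

Note that resampling by interpolation also shifts the frequency content, which may or may not match the augmentation used by the authors.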

[Figure 5 plot: S/N anomaly score versus cycle at 1000 r/min, annotated with the expert knocking levels None, Trace, Light, and None]

Figure 6: Overview of model learning method using the proposed method

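$$L = \frac{1}{N}\sum_{n=1}^{N}\left\lVert \hat{\mathbf{K}}_n - \mathbf{M}_n \odot \left(\hat{\mathbf{K}}_n + \mathbf{X}_n\right)\right\rVert_1 \tag{4}$$

$\hat{\mathbf{K}}_n$: Separated knocking sound spectrogram by the previous method
$\mathbf{X}_n$: Engine radiation noise without knocking noise
$\mathbf{M}_n$: T-F mask
$\odot$: Hadamard product
$\lVert \cdot \rVert_1$: L1 norm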
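The loss in Equation 4 can be evaluated as in the following minimal NumPy sketch. It assumes batches of amplitude spectrograms with shape (N, F, T); the constant toy arrays and the two hand-made masks stand in for real data and a DNN output and are purely illustrative.

```python
import numpy as np

def l1_mask_loss(knock_specs, noise_specs, masks):
    """Mean over examples of the L1 norm of K_hat - M * (K_hat + X)."""
    mixture = knock_specs + noise_specs           # network input, K_hat + X
    estimate = masks * mixture                    # Hadamard product with the mask
    per_example = np.abs(knock_specs - estimate).sum(axis=(1, 2))
    return float(per_example.mean())

# Toy batch: N = 2 examples, F = 3 frequency bins, T = 4 time frames.
knock_specs = np.ones((2, 3, 4))         # stand-in for K_hat
noise_specs = 3.0 * np.ones((2, 3, 4))   # stand-in for X

pass_through = np.ones((2, 3, 4))                         # mask that keeps everything
oracle = knock_specs / (knock_specs + noise_specs)        # ideal ratio mask, 0.25 everywhere

loss_pass_through = l1_mask_loss(knock_specs, noise_specs, pass_through)
loss_oracle = l1_mask_loss(knock_specs, noise_specs, oracle)
```

A mask that passes the whole mixture leaves all of the noise in the estimate, whereas the ratio mask removes it, so training pushes the DNN output toward the latter.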

4.2. Results of knocking sound separation by the proposed method

Figure 7(a) shows scatter plots of the amplitude spectrograms (average values) of the knocking sound separated by the DNNs trained with the previous and proposed methods at 1000, 3000, and 5000 r/min. In this figure, data for all rotational speeds are used as training data for the proposed method. The scatter plots include both cycles of normal combustion without knocking and cycles with knocking. The results of the previous method and the proposed method are generally similar at all rotation speeds, and the proposed method separates knocking sounds as well as the previous method does. Therefore, we conclude that a single DNN can separate the knocking sound at multiple rotation speeds.

[Figure 6 diagram labels: 1000, 3000, and 5000 r/min data; engine radiation noise; knocking sound; residual noise; DA: data augmentation; knocking sound + noise; loss calculation]

Figure 7: Scatter plots showing amplitude spectrograms of knocking sound (average values) at 1000, 3000, and 5000 r/min. (a) Comparison between the previous and the proposed method (training with data from all rotational speeds); (b) Comparison between the previous and the proposed method (use only 1000 r/min and 5000 r/min data and with data augmentation).

Next, to check the separation performance at a rotational speed not included in the training data, we input 3000 r/min data to a DNN trained by the proposed method on 1000 r/min and 5000 r/min data only. Figure 7(b) compares the scatter plots for the DNNs trained on 1000, 3000, and 5000 r/min data by the previous method with those for the DNN trained on 1000 and 5000 r/min data by the proposed method. The values obtained with the two methods are close.

The results suggest that when the DNN is trained with data from the upper and lower rotational speed limits, it is possible to separate knocking sounds even at rotational speeds in between.

5. CONCLUSIONS

In this paper, we described sound source separation of the knocking sound using a DNN that works even when clean knocking sound sources cannot be collected. Knocking can be detected, and its level evaluated, by using an anomaly score calculated from the separated knocking sound and the residual noise. The validation results show that a DNN trained with data from multiple rotation speeds and with augmented data can separate knocking sounds at rotation speeds not included in the training data.

In future research, we will consider applying this method to other engines using transfer learning. In addition, we will verify the correspondence between the anomaly score and expert evaluation.

6. REFERENCES

1. Taro Kasahara, Masayoshi Otaka, & Kenichi Komaba. Development of a knocking detection system using microphones. Journal of Society of Automotive Engineers of Japan, Vol. 47, No. 6, p. 1279-1284 (2016)

2. Taro Kasahara, Masayoshi Otaka, & Kenichi Komaba. Development of a knocking detection system using microphones (Part 2). Journal of Society of Automotive Engineers of Japan, Vol. 49, No. 4, p. 708-713 (2018)

3. Taro Kasahara, Hikaru Watabe, Taichi Ikeda, & Hiroshi Yoshikoshi. Estimation Method of Knocking Sound and In-cylinder Pressure from Engine Radiation Noise by Deep Learning (Part 3). Journal of Society of Automotive Engineers of Japan, Vol. 52, No. 2, p. 263-268 (2021)

4. Taro Kasahara, Hikaru Watabe, Taichi Ikeda, & Hiroshi Yoshikoshi. Estimation Method of Knocking Sound and In-cylinder Pressure from Engine Radiation Noise by Deep Learning (Part 4). Proceedings of Society of Automotive Engineers of Japan Annual Autumn Congress, No. 83-21 (2021)

5. Andreas Jansson et al. Singing Voice Separation with Deep U-Net Convolutional Networks. ISMIR 2017, https://ejhumphrey.com/assets/pdf/jansson2017singing.pdf

6. Y. Luo and N. Mesgarani. TaSNet: Time-Domain Audio Separation Network for Real-Time, Single-Channel Speech Separation. 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 696-700 (2018), doi: 10.1109/ICASSP.2018.8462116
