
Proceedings of the Institute of Acoustics

 

 

A study of different tapering windowing functions and referencing curve for the improvement of sound field using wave-field synthesis

 

Amrita Puri1, Indian Institute of Technology Jodhpur, Jodhpur, India

Akash Kumar2, Indian Institute of Technology Jodhpur, Jodhpur, India

 

ABSTRACT

 

Wave-field synthesis is a well-known sound field reproduction technique. Correct synthesis of the desired sound field ideally requires a continuous source distribution over an enclosing surface. Due to practical limitations, a secondary source array of finite length is used. This distorts the reproduced sound field because of diffraction from the edge loudspeakers. To alleviate this problem, tapering window functions are applied. In this paper, the influence of various windowing functions on the reproduced field is studied for different secondary source geometries and virtual source types, and several parameters are used to compare the performance of the windowing functions quantitatively. Further, the effect of different reference curves on the synthesis of the correct amplitude is presented. The simulation results show that the Tukey window function provides a wider sweet spot area with minimal amplitude ringing. The selection of a proper reference curve depends on the array geometry and the type of virtual source.

 

Keywords: Wave-field synthesis, Windowing function, reference curve.

 

1. INTRODUCTION

 

Wave-field synthesis (WFS) is an audio reproduction technique whose primary goal is to recreate a desired sound field over an extended space [1]. In theory, this requires loudspeakers driven by separate audio channels to enclose the region of interest, which is impractical. In practice, a loudspeaker array of finite length is employed, with appropriate approximations imposed on the theoretical framework, to recreate the correct sound field in the listener plane [1].

 

The use of a finite array length causes amplitude ringing (rippling) in the reproduced sound field, which is explained by diffraction of the wave [2]. To reduce this error, tapering windowing functions are generally used [2]. To the authors' knowledge, no quantitative study has investigated the usefulness of different windowing functions for different virtual source types, such as plane waves and point sources.

 

Using a linear loudspeaker array in place of a planar array yields the correct amplitude only at distinct points in the listening plane, termed reference points [3]. This paper also examines the combined effect of different reference points and windowing functions.

 

The theory used in this paper is discussed in Section 2. Numerical simulations that analyse the effectiveness of different windowing functions are discussed in Section 3. Section 4 concludes the work.

 

2. THEORY

 

In this section, the theory of WFS is briefly reviewed, and then error measures are defined to compare the windowing functions quantitatively.

 

2.1 Review of Theory of WFS

 

2.1.1 WFS

 

The theory of WFS is based on the Rayleigh first integral, which states that, for free-field wave propagation, the wave field created in a half-space by a planar distribution of monopole secondary sources can be written as the surface integral of the source strength D3D(𝑿𝟎, 𝜔) and the free-field Green’s function G3D(𝑿|𝑿𝟎, 𝜔) [4]. It can be expressed as,

 

 

where 𝑿 is the field point, 𝑿𝟎 is the location of the monopole sources at the boundary, and P3D(𝑿, 𝜔) is the field created in the 3D half-space in front of the array.

 

The Green’s function for a monopole source can be defined as [4],

 

 

and the driving function is defined as,

 

 

where the normal vector is the unit inward normal to the array.
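For orientation, the three relations referred to above are commonly written in the following form. This is only a sketch in a standard textbook convention: it assumes an e^{jωt} time dependence (so an outgoing wave carries e^{-jkr}), writes the unit inward normal as \mathbf{n}, and uses P for the desired field evaluated on the array. The paper's own Equations (1) to (3) may differ in sign convention or normalisation.

```latex
\begin{align}
P_{3D}(\mathbf{X},\omega) &= \iint_{S_0} D_{3D}(\mathbf{X}_0,\omega)\,
    G_{3D}(\mathbf{X}|\mathbf{X}_0,\omega)\,\mathrm{d}S_0, \\
G_{3D}(\mathbf{X}|\mathbf{X}_0,\omega) &= \frac{1}{4\pi}\,
    \frac{e^{-jk|\mathbf{X}-\mathbf{X}_0|}}{|\mathbf{X}-\mathbf{X}_0|}, \\
D_{3D}(\mathbf{X}_0,\omega) &= -2\,\frac{\partial P(\mathbf{X}_0,\omega)}{\partial \mathbf{n}} .
\end{align}
```

With these assumed conventions, a plane wave travelling along \mathbf{n} gives D_{3D} = 2jkP on the array, and integrating the Green's function over the plane returns the same plane wave in the half-space.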

 

For the synthesis of a sound field using a linear distribution of secondary sources, the surface integral in Equation (1) is evaluated with the stationary phase approximation (SPA). Under a high-frequency assumption, the secondary sources that lie in the listener plane contribute most to the field in that plane. These sources act as the stationary points of the complex exponential integrand in Equation (1), and the approximation introduces a correction term that ensures correct synthesis at specific locations in the listener plane, called the reference curve [4]. For a discrete and fixed number (N0) of secondary sources, after incorporating the correction term C1, Equation (1) can be approximated for the field P2D(𝑿, 𝜔) in the listener plane as:

 

 

where Δx0 is the spacing between consecutive loudspeakers of the array, and 𝑿ref denotes the locations in the listener plane where a relatively accurate amplitude can be reproduced. The wave number is 𝑘 = |𝑲| = 𝜔 / 𝑐.
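The discrete superposition described above can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: it assumes the e^{-jkr}/(4πr) monopole convention and a sound speed of 343 m/s, and it leaves the driving weights (including any correction term and window) to the caller. The function names `green_3d` and `synthesize` are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air


def green_3d(field_point, source_point, k):
    """Free-field monopole Green's function e^{-jkr} / (4*pi*r) (assumed convention)."""
    r = math.dist(field_point, source_point)
    return cmath.exp(-1j * k * r) / (4.0 * math.pi * r)


def synthesize(field_point, source_points, driving, k, dx0):
    """Discrete superposition: sum over sources of D_n * G(X | X0_n), times spacing dx0."""
    total = 0j
    for d_n, x0_n in zip(driving, source_points):
        total += d_n * green_3d(field_point, x0_n, k)
    return total * dx0


# Wave number for a 1000 Hz tone, k = omega / c
k = 2.0 * math.pi * 1000.0 / SPEED_OF_SOUND

# Two hypothetical secondary sources 10 cm apart, driven with unit weights
sources = [(-0.05, 0.0), (0.05, 0.0)]
pressure = synthesize((0.0, 2.0), sources, [1.0, 1.0], k, dx0=0.1)
```

Keeping the driving weights as an input makes it easy to compare a rectangular window against a tapered one on the same geometry.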

 

The driving function for the synthesis of a desired sound field in the listener plane using secondary sources placed in the listener plane can be written as:

 

 

2.1.2 Driving function for a plane wave reproduction

 

A propagating plane wave can be expressed in the spatial frequency or wave number domain (𝑲) as:

 

 

For wave field synthesis using a curved array, the driving function is modified by incorporating the secondary source selection term a(𝑿0) [3], which is defined as a(𝑿0) = 1 if ⟨𝒏pw, 𝒏(𝑿0)⟩ > 0 and a(𝑿0) = 0 otherwise. The driving function can be obtained using Equations (5) and (6) as:

 

 

where 𝐵(𝜔) represents the amplitude of the monochromatic virtual/desired plane wave, 𝒏pw represents its propagation direction and 𝒏(𝑿0) is the unit inward normal vector at the boundary.
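A binary secondary source selection for a plane wave can be sketched as below. The rule used here, activating a source when the plane-wave direction has a positive projection on the inward normal, follows the common formulation in the WFS literature and is an assumption about the exact criterion used in this paper. The function name, the example normals, and the reading of the 70° angle as measured from the x-axis are all hypothetical.

```python
import math


def select_sources_plane_wave(n_pw, normals):
    """Return 1 for each secondary source whose inward normal has a positive
    projection on the plane-wave direction n_pw, otherwise 0."""
    selection = []
    for n0 in normals:
        projection = sum(a * b for a, b in zip(n_pw, n0))
        selection.append(1 if projection > 0 else 0)
    return selection


# Hypothetical example: propagation direction at 70 degrees from the x-axis
theta = math.radians(70.0)
n_pw = (math.cos(theta), math.sin(theta))

# Three illustrative inward normals: facing +y, facing -y, facing +x
normals = [(0.0, 1.0), (0.0, -1.0), (1.0, 0.0)]
active = select_sources_plane_wave(n_pw, normals)
print(active)  # [1, 0, 1]
```

For a linear array whose normals all point the same way, the selection is trivially all ones or all zeros; the term matters for curved arrays, where the normal changes along the contour.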

 

2.1.3 Driving function for a point source

 

Pressure field (𝑃SW) created by a point source at 𝑿s can be written as:

 

 

The desired driving function can be obtained by substituting Equation (8) into Equation (5) as:

 

 

where 𝑿0s is the position vector from the virtual point source to the boundary.

 

2.2 Indices to quantify error in reproduced sound field

 

2.2.1 Secondary Source utilization factor (SSUF)

 

When a tapering window 𝑊(𝑿0) is used, it reduces the strength of some secondary sources, which lowers the sound intensity in the listener plane. In this paper, SSUF measures the utilisation of the secondary sources when a particular window function is used, relative to the case when no tapering window is used.

 

 

where 𝑁0 is the total number of secondary sources over which the tapering window function is applied.
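One plausible reading of SSUF is the mean window weight over the 𝑁0 sources, which equals 1 for a rectangular window. The sketch below assumes that reading; the exact expression used in the paper is not reproduced here, and the function names are hypothetical.

```python
import math


def hann_window(n_sources):
    """Symmetric cos^2 (Hann) taper with zero weight at both array ends."""
    if n_sources < 2:
        return [1.0] * n_sources
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * n / (n_sources - 1)))
            for n in range(n_sources)]


def ssuf(window):
    """Mean window weight, i.e. utilisation relative to an all-ones window (assumed definition)."""
    return sum(window) / len(window)


rectangular = [1.0] * 35
print(ssuf(rectangular))      # 1.0
print(ssuf(hann_window(35)))  # about 0.486 (17/35)
```

Under this assumption, a 35-element cos² taper uses a little under half of the available source strength, which is close to the value of 0.48 quoted for the cos² window in the numerical study.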

 

2.2.2 Normalized mean square error of pressure (NMSEP)

 

To measure the global amplitude error in the reproduced sound field, NMSEP is used. It measures the deviation of the pressure amplitude of the reproduced sound field from that of the ideal sound field. The test data points selected for the analysis are shown in Figure 2. NMSEP is defined as,

 

 

where the two pressure amplitudes are those of the reproduced and ideal sound fields at the 𝑚th test data point.
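A normalised mean square error of pressure amplitude can be sketched as follows. The normalisation used here (squared amplitude error divided by the summed squared ideal amplitude) is an assumption, since the paper's expression is not reproduced in this text; the function names are hypothetical.

```python
import math


def nmsep(p_reproduced, p_ideal):
    """Squared amplitude error summed over test points, normalised by the
    summed squared ideal amplitude (assumed normalisation)."""
    error = sum((abs(p_r) - abs(p_i)) ** 2 for p_r, p_i in zip(p_reproduced, p_ideal))
    reference = sum(abs(p_i) ** 2 for p_i in p_ideal)
    return error / reference


def to_db(ratio):
    """10*log10 of a power-like ratio; returns -inf for an exact match."""
    return 10.0 * math.log10(ratio) if ratio > 0 else float("-inf")


# Reproduced amplitudes twice the ideal ones give a ratio of 1, i.e. 0 dB
print(to_db(nmsep([2.0, 2.0], [1.0, 1.0])))  # 0.0
```

Because only amplitudes are compared, a phase-only mismatch between the two fields leaves this measure at zero.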

 

2.2.3 Localized root mean square error (LRMSE) and Average LRMSE (ALRMSE)

 

The use of a discrete array of finite length results in unevenness in the reproduced pressure amplitude. A waviness (ringing) effect in the pressure amplitude can be observed in Figure 4. To measure these distortions locally, LRMSE is used in this paper.

 

 

where 𝑀n is the number of data points in the 𝑛th circle and (LRMSE)n is the localized root mean square error of the pressure amplitude in the 𝑛th circle (Figure 3).

 

The mean pressure amplitude in the 𝑛th circle is expressed as,

 

 

where 𝑃0(𝑿𝑚)𝑛 is the pressure at the 𝑚th point in the 𝑛th circle.
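The localized error for one circle, and its average over all circles, can be sketched as below. This assumes LRMSE is the root mean square deviation of the pressure amplitude from the circle's own mean amplitude, which matches the description in the text but may differ in detail (for example in dB scaling) from the paper's expression. The function names are hypothetical.

```python
import math


def lrmse(pressures):
    """RMS deviation of pressure amplitude from the mean amplitude within one circle."""
    amplitudes = [abs(p) for p in pressures]
    mean_amplitude = sum(amplitudes) / len(amplitudes)
    squared_dev = sum((a - mean_amplitude) ** 2 for a in amplitudes)
    return math.sqrt(squared_dev / len(amplitudes))


def alrmse(circles):
    """Average of the per-circle LRMSE values."""
    return sum(lrmse(points) for points in circles) / len(circles)


# Two illustrative circles: one rippled (amplitudes 1 and 3), one perfectly flat
print(lrmse([1.0, 3.0]))                  # 1.0
print(alrmse([[1.0, 3.0], [2.0, 2.0]]))   # 0.5
```

Since each circle is compared with its own mean, a uniformly quieter region scores as well as a uniformly louder one; only local unevenness is penalised.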

 

The average of the LRMSE of pressure over all circular regions is taken as:

 

 

2.3 Indices to quantify ringing effect in reproduced sound field

 

2.3.1 Mean pressure amplitude

 

The mean pressure amplitude is the average of the absolute pressure at y = 2 m along the length of the array (from x = −2.5 m to x = 2.5 m).

 

 

where 𝑃(𝑿𝑚, 2) is the pressure along the x-axis at y = 2 m, M is the total number of data points sampled at equal intervals between 𝑥 = −2.5 m and 𝑥 = 2.5 m, and 𝑃ref = 20 𝜇Pa.

 

2.3.2 Root mean square error

 

This parameter is generally used in the literature to measure the amount of error in the reproduced sound field.

 

 

RMSE cannot effectively capture the rippling effect (alternating decrease and increase in pressure amplitude) in the reproduced sound field. To quantify this effect, two new parameters, the waviness number (W. No.) and the waviness score (W. S.), are introduced in this paper.

 

2.3.3 Waviness number (W. No.)

 

The waviness number is defined as the number of extremum points (maxima and minima) of the pressure amplitude (see Figure 6) along the x-direction at y = 2 m, counting only those for which the difference between consecutive extremum points exceeds 0.5 dB, since a change in sound level below 0.5 dB is not detectable by human listeners [5].
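The counting rule can be sketched as follows. This is one possible interpretation, stated as an assumption: interior local extrema are found from sign changes of the sample-to-sample difference, and an extremum is counted if it differs from at least one neighbouring extremum by more than the threshold. The function names are hypothetical.

```python
def local_extrema(levels):
    """Indices of interior local maxima and minima, found from sign changes
    of the first difference (flat steps are skipped)."""
    indices = []
    previous_sign = 0
    for i in range(1, len(levels)):
        diff = levels[i] - levels[i - 1]
        sign = (diff > 0) - (diff < 0)
        if sign == 0:
            continue
        if previous_sign != 0 and sign != previous_sign:
            indices.append(i - 1)
        previous_sign = sign
    return indices


def waviness_number(spl_db, threshold_db=0.5):
    """Count extrema that differ from a neighbouring extremum by more than threshold_db."""
    extrema = local_extrema(spl_db)
    count = 0
    for j, i in enumerate(extrema):
        neighbours = extrema[max(j - 1, 0):j] + extrema[j + 1:j + 2]
        if any(abs(spl_db[i] - spl_db[n]) > threshold_db for n in neighbours):
            count += 1
    return count


print(waviness_number([0.0, 2.0, 0.0, 2.0, 0.0]))  # 3
print(waviness_number([0.0, 0.1, 0.0, 0.1, 0.0]))  # 0
```

In the second example the 0.1 dB ripples fall below the 0.5 dB threshold, so they are not counted.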

 

2.3.4 Waviness Score (W. S.)

 

Consider two cases in which the same pressure difference between two extrema occurs over a distance l1 in the first case and l2 in the second, with l1 < l2. The sound field of the first case is more annoying than that of the second because the SPL changes more quickly when the listener moves their head. To account for such situations when comparing different windowing functions, the ‘waviness score’ parameter is defined as:

 

 

where Δ𝑝amp is the difference between the SPL at consecutive local minima and maxima of pressure amplitude (Δ𝑝amp > 0.5 dB) along the prescribed line of analysis, and Δx is the distance between two geometrical locations at which Δ𝑝amp is observed.
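A score of this kind can be sketched as below. The aggregation is an assumption: the sketch sums the slope Δ𝑝amp/Δx over consecutive extremum pairs whose level difference exceeds 0.5 dB, so that the same ripple over a shorter distance scores higher. The paper's exact expression is not reproduced here, and the function names are hypothetical.

```python
def local_extrema(levels):
    """Indices of interior local maxima and minima (sign changes of the first difference)."""
    indices = []
    previous_sign = 0
    for i in range(1, len(levels)):
        diff = levels[i] - levels[i - 1]
        sign = (diff > 0) - (diff < 0)
        if sign == 0:
            continue
        if previous_sign != 0 and sign != previous_sign:
            indices.append(i - 1)
        previous_sign = sign
    return indices


def waviness_score(x, spl_db, threshold_db=0.5):
    """Sum of |delta SPL| / |delta x| over consecutive extremum pairs above the threshold."""
    extrema = local_extrema(spl_db)
    score = 0.0
    for i, j in zip(extrema, extrema[1:]):
        delta_p = abs(spl_db[j] - spl_db[i])
        if delta_p > threshold_db:
            score += delta_p / abs(x[j] - x[i])
    return score


ripple = [0.0, 2.0, 0.0, 2.0, 0.0]
print(waviness_score([0.0, 1.0, 2.0, 3.0, 4.0], ripple))  # 4.0
print(waviness_score([0.0, 2.0, 4.0, 6.0, 8.0], ripple))  # 2.0
```

The two calls use the same ripple amplitude; stretching it over twice the distance halves the score, which is the behaviour the text motivates.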

 

3. NUMERICAL STUDY

 

To analyse different windowing functions and reference curves, three arrays of 35 monopole sources with an inter-source distance of 10 cm (shown in Figure 1) are chosen for the numerical study. For the elliptical arcs, the semi-major and semi-minor axes are 2.5 m and 1.25 m, respectively. Figure 2 shows the locations of the test data points used to calculate the global normalised mean square error. Figure 3 shows the circles used to calculate the localised root mean square error. Two sound fields are considered: (a) an inclined plane wave with 𝜃pw = 70° and (b) a point source at coordinates (-0.8, -0.6).

 

 

Figure 1: Array geometries used for analysis. A virtual point source at 𝐗S(−0.8, −0.6).

 

 

Figure 2: Location of test data points (blue dots) to analyse global pressure and intensity in listener plane.

 

 

Figure 3: Circles used to analyse reproduced sound field locally (represents surrounding of an individual).

 

 

Figure 4: Pressure amplitude at test points 𝑿m of an inclined plane wave of frequency 700 Hz and 𝜃pw = 70° with a rectangular window.

 

 

Figure 5: Different windowing functions used for analysis.

 

Figure 4 shows a large amount of rippling in the sound pressure level (SPL, in dB) when a rectangular window is used. In this section, the performance of different tapered windowing functions (shown in Figure 5) in reducing these ripples is analysed.
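The window families named in this study (rectangular, Tukey and cosine-power) can be sketched as follows. The Tukey window is parameterised here by `taper_fraction`, the fraction of the array length that is tapered, as in the common signal-processing definition. How the paper's β maps onto this parameter is not stated in the text; the description of Tukey (β = 0.75) as the less tapered case suggests that β may be the flat fraction, but that is an assumption. The function names are hypothetical.

```python
import math


def tukey_window(n_sources, taper_fraction):
    """Tukey (tapered cosine) window; taper_fraction = 0 is rectangular, 1 is cos^2."""
    if n_sources < 2 or taper_fraction <= 0.0:
        return [1.0] * n_sources
    weights = []
    for n in range(n_sources):
        x = n / (n_sources - 1)      # normalised position along the array, 0..1
        edge_distance = min(x, 1.0 - x)
        if edge_distance >= taper_fraction / 2.0:
            weights.append(1.0)      # flat central part
        else:
            weights.append(0.5 * (1.0 - math.cos(2.0 * math.pi * edge_distance / taper_fraction)))
    return weights


def cos_power_window(n_sources, power):
    """cos^power window centred on the array (power = 2 gives cos^2, power = 3 gives cos^3)."""
    if n_sources < 2:
        return [1.0] * n_sources
    return [max(0.0, math.cos(math.pi * (n / (n_sources - 1) - 0.5))) ** power
            for n in range(n_sources)]


rectangular = tukey_window(35, 0.0)
half_tapered = tukey_window(35, 0.5)
cos_squared = cos_power_window(35, 2)
```

A fully tapered Tukey window coincides with the cos² window, so the two families can be compared on a single scale of increasing taper.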

 

Case I: An inclined plane wave

 

 

Figure 6: SPL along x-direction at y = 2m for an inclined plane wave (PW) of frequency 1000 Hz reproduced using linear and elliptical secondary source array geometries with different tapering windowing functions.

 

Figure 6(a) shows the sound pressure level along the x-axis at y = 2 m for an inclined plane wave synthesised using a linear array with seven different windowing functions. The reference curve is taken parallel to the x-axis at y = 2 m. When a rectangular window is used, there are significant ripples in the reproduced sound field, especially on the left side (around x = -1 and x = -2). When the Tukey (β = 0.75), Tukey (β = 0.50), Tukey (β = 0.25), cos³ and cos² windows are used, the ripples are significantly reduced. The listening area obtained using the Tukey (β = 0.75) window is larger than that obtained with the cos² or cos³ window.

 

Figure 6(b) shows the SPL of a reproduced inclined plane wave using linear and elliptical arc arrays with three different windowing functions (rectangular, cos² and Tukey (β = 0.5)). When a rectangular window is used, ripples are observed for the linear as well as the elliptical arc arrays. Compared to a linear array, a curved array of secondary sources introduces a significant amount of rippling in the sound field even when a Tukey or cos² windowing function is used. The Tukey (β = 0.5) window provides a larger listening area than the cos² window for all three arrays. From Figures 6(a) and 6(b), it can be concluded that for a finite-length (truncated) array, a linear array is preferable to a curved array because it provides a larger sweet region (less rippling and a larger mean value) even for a less tapered window function (Tukey (β = 0.75)).

 

 

Figure 7: (a) Mean value of SPL and (b) localised root mean square error (LRMSE) in circles for four different windowing functions.

 

Figure 7 shows the mean value of SPL and the LRMSE in the circles (shown in Figure 3) for four different windowing functions (rectangular, Tukey (0.75), Tukey (0.5) and cos²). Figures 7(a) and 7(b) show that for circles inside zone 1 (3, 4, 9, 10, 16, 17, 24, 25, etc.), the SPL and the LRMSE are almost the same for all four windows. However, at the other circles (inside zone 2), both the mean SPL and the LRMSE are lower for the cos² window than for the Tukey (0.75) window. This means that cos² is the most effective window in reducing LRMSE, but it also reduces the effective listening area, whereas Tukey (0.75) (or Tukey (0.5)) reduces the LRMSE while providing a relatively larger listening area. Therefore, it can be concluded from Figures 7(a) and 7(b) that Tukey (0.75 or 0.5) is an optimal choice of window function for the synthesis of an inclined plane wave.

 

Table 1 shows the normalised mean square error (dB) and the average localised root mean square error (dB) for three different reference curves and nine windowing functions. Rf1 represents a reference point at coordinates (0, 2), Rf2 corresponds to a reference line parallel to the array and offset by 2 m, and Rf3 is a line parallel to the plane wave front whose left end is 2 m above the linear array. Table 1 shows that, for a given reference curve, as the tapering of the windowing function increases, NMSE increases and ALRMSE decreases. NMSE represents the global deviation of the synthesized sound field from the ideal sound field, whereas ALRMSE represents the deviation from the localized mean value. Therefore, it can be concluded that ALRMSE is a good parameter for quantifying diffraction effects (or smoothness of the reproduced sound field). For all windowing functions, Rf3 leads to the minimum values of ALRMSE. For Tukey (β = 0.75), the secondary source utilisation factor is 0.84 and ALRMSE is 45.70, whereas for cos², SSUF is 0.48 and ALRMSE is 43.24. Therefore, the numerical values in Table 1 also support the observation that Tukey (0.75 or 0.5) can be an optimal choice of windowing function.

 

Table 1: Different performance metrics to quantify the error in reproduction of an inclined plane wave using nine different tapering windows and three different reference curves.

 

 

Case II: A virtual point source

 

 

Figure 8: SPL at y = 2 m along x-direction for a virtual point source (SW) at (-0.8, -0.6) reproduced using linear and elliptical secondary source array geometries using rectangular and Tukey windowing functions.

 

Figure 8 shows the SPL at y = 2 m along the x-axis for a virtual point source at (-0.8, -0.6) reproduced using the three different arrays with rectangular and Tukey (β = 0.5) windowing functions. The reference curve is a circular arc of radius 𝑟 = max(|𝑿s − (𝑿0)A|, |𝑿s − (𝑿0)B|) + 0.3, where (𝑿0)A and (𝑿0)B are the end points of the array on the negative and positive sides of the x-axis and 𝑿s is the desired location of the virtual point source. The theory of the stationary phase approximation (SPA) [4] in WFS is used to find the reference point on the reference curve for each secondary source. Figure 8 shows that the Tukey window (for all three arrays) leads to a significant reduction in ripples, and that the linear array provides the largest listening area, followed by the elliptical arc array with inward normal (Eup) and then Edw.
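The radius of the circular reference arc follows directly from the expression in the text: the larger of the two distances from the virtual source to the array end points, plus a 0.3 m margin. The sketch below evaluates it for the stated source position; the end points used here, (±1.7, 0), are a hypothetical choice for a 35-element linear array with 10 cm spacing centred on the origin, and the function name is also hypothetical.

```python
import math


def reference_arc_radius(x_s, x0_a, x0_b, margin=0.3):
    """Radius of the reference arc centred at the virtual source x_s:
    the larger end-point distance plus a fixed margin (in metres)."""
    return max(math.dist(x_s, x0_a), math.dist(x_s, x0_b)) + margin


x_s = (-0.8, -0.6)                    # virtual point source from the text
x0_a, x0_b = (-1.7, 0.0), (1.7, 0.0)  # hypothetical array end points
r = reference_arc_radius(x_s, x0_a, x0_b)
print(round(r, 3))
```

Taking the larger of the two distances ensures that the arc passes in front of the whole array, so every secondary source has a reference point on the listener side.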

 

Figures 6 and 8 show that the ripples are more pronounced on the left side of the array, both for the inclined plane wave (of angle 70°) and for the virtual point source located at (-0.8, -0.6). Ripples are formed by diffraction of sound waves, i.e. truncation effects. The strength of the diffracted waves depends on the actual strength of the desired sound field near the end of the array and on the angle between the desired sound field and the array. For the inclined plane wave, it is speculated that diffraction is stronger at the left end because of the smaller angle between the normal to the desired plane wave front and the linear array. For the virtual point source, it is speculated that the actual source strength of the desired sound field is greater near the left end of the array.

 

Table 2: Performance parameters for the reproduction of a virtual point source.

 

 

Table 2 shows different performance metrics that quantify the error in the reproduction of a virtual point source using rectangular and Tukey windows and different reference curves. Rf1 and Rf2 are the same as in Table 1, and Rf4 is a circular arc centred at 𝑿S with radius 𝑟 = max(|𝑿S − (𝑿0)A|, |𝑿S − (𝑿0)B|) + 0.3, where (𝑿0)A and (𝑿0)B are the end points of the array on the negative and positive sides of the x-axis and 𝑿S is the desired location of the virtual point source. Table 2 shows that the mean pressure amplitude and RMSE are almost the same for both windows across the array geometries, whereas the waviness number and waviness score differ markedly between the rectangular and Tukey windows. This shows that conventional error measures cannot capture the performance of a windowing function, whereas the proposed waviness number and waviness score clearly represent the reduction in diffraction effects (rippling) achieved with a tapering windowing function.

 

4. CONCLUSIONS

 

In this paper, a comparative study of different tapering windowing functions and different reference curves for the reproduction of a sound field using wave field synthesis is carried out. The use of a windowing function reduces ripples in the reproduced sound field, but it also reduces the sound intensity in some regions. To quantify the effective listening area, indices of localized mean pressure and localized root mean square error are introduced. Traditional parameters such as the global root mean square error, which is most commonly used to investigate the quality of the reproduced sound field, are not suitable for measuring the error that arises from truncation effects or ripples in the reproduced sound field. New parameters, the waviness number and the waviness score, are introduced to study this error and to make a valid comparison among the different reference curves and windowing functions. Numerical simulations are carried out for the synthesis of an inclined plane wave and a point source. Three array geometries are chosen for the study: (a) linear, (b) elliptical arc, inward concave and (c) elliptical arc, inward convex. It is found that the waviness number and waviness score are effective in quantifying the smoothness of sound fields. Among the various windowing functions, the Tukey window (β = 0.75) is found to be optimal: it significantly reduces ripples in the reproduced sound field while retaining a relatively larger effective listening area. For an inclined plane wave, a reference line parallel to the plane wave front, and for a virtual point source, a reference curve parallel to the circular wave front (in the listener plane), lead to relatively accurate (smooth) sound fields.

 

REFERENCES

 

  1. A. J. Berkhout, “Holographic approach to acoustic control,” J. Audio Eng. Soc., vol. 36, no. 12, 1988.

  2. E.N.G. Verheijen, “3D Sound Reproduction by Wave Field Synthesis,” PhD thesis, Delft University of Technology, Delft, The Netherlands, 1997.

  3. S. Spors, R. Rabenstein, and J. Ahrens, “The theory of wave field synthesis revisited,” in Proc. 124th Audio Engineering Society Convention, 2008, pp. 413–431.

  4. E. W. Start, “Direct sound enhancement by wave field synthesis,” PhD thesis, Delft University of Technology, Delft, The Netherlands, 1997.

  5. S. Möller et al., “A cross-university massive open online course on communication acoustics,” J. Acoust. Soc. Am., vol. 141, no. 5, pp. 3556–3556, 2017, doi: 10.1121/1.4987538.

  6. X. Hu, J. Wang, W. Zhang, and L. Zhang, “Time-domain sound field reproduction with pressure and particle velocity jointly controlled,” Appl. Sci., vol. 11, no. 22, 2021, doi: 10.3390/app112210880.

 

Reply to Reviewer’s comments

 

The authors are thankful for the opportunity to improve the submitted draft titled "A study of different tapering windowing functions and referencing curve for the improvement of sound field using wave-field synthesis". We thank the reviewer for taking the time to read the manuscript and provide critical feedback to improve its quality. We have tried to address all the reviewer's comments and have revised the manuscript accordingly.

 

Comment # 1: Abstract and title state the objective of the paper clearly.

 

Reply: Thank you for this comment.

 

Comment # 2: The basic theories section is a bit rough, in the way it introduces the source selection term "a" by not showing it explicitly, and in my view its many subsubsections make it appear a bit more restless than useful for easy readability. Maybe remove the subsubsections and explain that "a" is binary and excludes sections with the wrong sign of X0s.X0ref, where waves leave the synthesis volume.

 

Reply: To make it easier to read, we restructured the basic theory section (Section 2) and explained the terms used in the expressions. The source selection term is defined in the revised manuscript.

 

Revision: In the revised manuscript, the following sentence has been added.

 

“The expression of source selection term is defined as, a(𝑿0) = 1 if ⟨𝒏pw, 𝒏(𝑿0)⟩ > 0; otherwise, a(𝑿0) = 0.”

 

Comment # 3: In general, it would be interesting to explain some of the terms, for instance the amplitude term B(w) of the plane wave, the secondary-source/integral coordinates X0, the receiver coordinate X, the source coordinate Xs(?), that these are cartesian position vectors

 

Reply: Thank you for this suggestion. In the revised manuscript, these terms have been defined (below Eq.(1) and above Eq.(8)).

 

Revision: In the revised manuscript, the following lines have been added.

 

“𝑿 is the field point and 𝑿0 is the location of monopole sources at boundary and 𝑃3D(𝑿, 𝜔) is the field created in 3D half-space in front of the array.”

 

“Pressure field (𝑃Sw) created by a point source at 𝑿s can be written as:”

 

Comment # 4: K is the wave vector (unit vector times wave number, what is the wave number).

 

Reply: The bold 𝑲 represents the wave vector and non-bold italic 𝑘 represents the wave number, a scalar quantity.

 

Revision: We have defined the magnitude of the wave vector as, “𝑘 = |𝑲| = 𝜔 / 𝑐” below Eq. (4) in the revised manuscript.

 

Comment # 5: Xref is a position on a reference contour in the synthesis volume, where correct synthesis is demanded, that | Xref - X | denotes the euclidean distance. What is the term w(x0)i ?

 

Reply: The term 𝑊(𝑿0) represents the tapering windowing function.

 

Revision: In the revised manuscript, the term 𝑊(𝑿0) has been explained in section 2.2.1 and different tapering windowing functions have been presented in Figure 5.

 

Comment # 6: Does now a non-bold and non-capitalized position x0 denote the secondary source position?

 

Reply: Thank you for the comment.

 

In the revised manuscript, the x0 term is made bold and capitalized. It represents the position of the secondary source in the listener plane.

 

Revision: In the revised manuscript, w(x0)i is modified as 𝑊(𝑿0)n.

 

Comment # 7: The index m for the field point is sometimes bold, pls. check. in eq 8,9,10,... I assume it should be a non-bold index, everywhere.

 

Reply: Thank you for pointing out this mistake. In the revised manuscript, all subscripts ‘m’ are made non-bold.

 

Comment # 8: Fig. 4 could benefit from correct scaling of the diagrams in xy that does not distort the labels and title. Could it be vector graphics?

 

Reply: Thank you for this suggestion.

 

Revision: Figure 4 of the previous manuscript is replaced by Figure 7 in the revised manuscript. This is done to represent a better comparison among different windowing functions.

 

Comment # 9: In eqs. 15,16, the position vectors seem to have changed, again, to non-capitalized, non bold symbols (?).

 

Reply: Thank you for pointing out this error. In the revised manuscript 𝑥m is made small and non bold.

 

Comment # 10: And in eq. 16, indexing is done differently, and discretization of x is not made explicit by an index. .. pls unify.

 

Reply: Thank you for pointing out this error. In the revised manuscript, x is indexed in the same way as in Eq. (15).

 

Comment # 11: Fig.7 could also benefit from vector-graphics representation, if that is still possible. Otherwise, it is nearly impossible to read the legends of the diagrams in the given size.

 

Reply: Thank you for this suggestion. In the revised manuscript, Fig. 7 has been replaced by Fig. 6 and Fig. 8. Vector graphics are used in Fig. 6(a) and Fig. 8, but not in Fig. 6(b), because doing so made the graph more cluttered.

 

Comment # 12: Symbols in the figure captions are also diverging a bit from typeset used in the equations.

 

Reply: In the revised manuscript, the authors have tried to resolve this issue.

 

Comment # 13: This sentences probably requires re-formulation: "There is a slight improvement of SPL in the sweet area when the curve array with inward normal in the direction of the listener plane is used" - What does it mean? Isn't that somehow true for any reference point and any secondary source contour that exists?

 

Reply: In the revised manuscript, we have modified this and other related paragraphs to improve the readability.

 

Revision: The modified paragraph in the revised manuscript is as follows, “Figure 8 shows SPL at y = 2 m along x-axis for a virtual point source at (-0.8, -0.6) reproduced using three different arrays with rectangular and Tukey (β = 0.5) windowing functions. Reference curve is a circular arc having a radius, 𝑟 = max(|𝑿S − (𝑿0)A|, |𝑿S − (𝑿0)B|) + 0.3 where (𝑿0)A and (𝑿0)B are the end points of the array in the negative and positive side of x-axis and 𝑿S is desired location of the virtual point source. It is observed from Figure 8 that use of Tukey window for all the three cases of arrays leads to significant reduction in ripples and linear array provides the maximum listening area followed by an elliptical arc array with inward normal (Eup) and then, Edw.”

 

Comment # 14: The rest is rather clear from the numbers provided.

 

Reply: Thank you for this comment.

 

Other changes:

 

The expressions for LRMSEI and ARMSEI in the previous manuscript have been removed from the updated manuscript, because the variation of intensity across the window functions follows a pattern similar to that of pressure.

 


1 amritapuri@iitj.ac.in

2 akash.1@iitj.ac.in