
Proceedings of the Institute of Acoustics

 

Comparison of SPLmax using AES75 and multitone distortion (MTD) methods

 

J. Hipperson, Funktion One Research

 

1 INTRODUCTION

 

The maximum sound pressure level (SPLmax) achievable from a loudspeaker is conceptually a useful figure, but in practice current measurement and reporting methods leave a significant degree of uncertainty in accuracy and comparability of figures. SPLmax typically ignores distortion and power compression, further reducing usefulness.

 

SPLmax is very loosely defined in IEC 60268-21, which permits the measurement stimulus, position and environment to be specified by the manufacturer, and only requires that the DUT (device under test) can reproduce a defined broadband stimulus for 100 hours without damage.

 

AES75 attempts to improve on this by considering power compression, and introduces a minimum allowable coherence into the SPLmax specification, which naturally includes distortion. The stimulus is M-Noise, a pink spectrum noise signal with increasing crest factor with frequency.

 

In reality, loudspeakers are non-linear well before power compression becomes significant. In this paper, SPLmax is measured using a total modulation distortion (TMD) threshold and compared to results using AES75 and IEC 60268-21 pink noise.

 

2 BACKGROUND

 

2.1 Characteristics of Music Signals

 

With some exceptions (e.g. Dubstep and Noise music), one characteristic common to most styles and genres of music is a low RMS level with high peaks, i.e. a high crest factor (ratio of peak to RMS) compared to random noise or simple waveforms such as a sine wave.

 

Looking at a probability density function (PDF) of a music signal (Figure 1), it is clear that music signals spend the most time at zero amplitude (due to zero crossings) and at low amplitudes. This is in contrast to a sine wave, which has almost the opposite behaviour, spending the most time at the amplitude extremes. Random and pseudo-random signals such as noise and multitones have a more music-like PDF.
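The contrast between a sine wave and a noise-like signal can be reproduced numerically. The following is a minimal sketch, not the analysis used for Figure 1: it assumes Gaussian noise as a stand-in for a music-like signal, and the helper names `crest_factor_db` and `amplitude_pdf` are illustrative only.

```python
import math
import random

def crest_factor_db(samples):
    """Peak-to-RMS ratio of a signal, in decibels."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(peak / rms)

def amplitude_pdf(samples, bins=10):
    """Fraction of samples in each of `bins` equal-width bins spanning -peak..+peak."""
    peak = max(abs(s) for s in samples)
    counts = [0] * bins
    for s in samples:
        idx = min(int((s + peak) / (2 * peak) * bins), bins - 1)
        counts[idx] += 1
    return [c / len(samples) for c in counts]

n = 48000
# 100 full cycles of a sine wave
sine = [math.sin(2 * math.pi * 100 * i / n) for i in range(n)]
# Gaussian noise, an assumed stand-in for a music-like signal
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

sine_pdf = amplitude_pdf(sine)    # largest values in the outermost bins
noise_pdf = amplitude_pdf(noise)  # largest values in the central bins

print(f"sine crest factor:  {crest_factor_db(sine):.2f} dB")
print(f"noise crest factor: {crest_factor_db(noise):.2f} dB")
```

The sine histogram is heaviest in the outermost bins and the noise histogram is heaviest around zero, which is the behaviour described above.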

 

 

Figure 1: Comparison of PDFs for various signals

 

Crest factor tends to increase with frequency in music signals. This fact has resulted in the development of M-Noise, a noise signal specifically designed to replicate this aspect of music signals. Looking at PDFs of low and high pass filtered frequency bands of the same music signal (with a logarithmic Y-axis), we can see this effect clearly, as the low pass filtered band spends much more time at higher amplitudes than the high pass filtered band.

 

 

Figure 2: Comparison of PDFs for low and high pass filtered music

 

The high crest factor of music works in our favour most of the time, as RMS signal level is proportional to heating in the voice coil, meaning that thermal dissipation is not a significant problem for correctly specified loudspeakers operating at reasonable levels. However, it is a more significant problem at low frequencies, where the crest factor is much lower and the loudspeaker therefore experiences greater heating power.

 

2.2 Limits of Electrodynamic (Moving Coil) Loudspeakers

 

Loudspeaker failure typically has two mechanisms; over-temperature and mechanical failure. In the over-temperature mechanism, the RMS level of the signal produces more heat than the voice coil can dissipate, ultimately resulting in the voice coil wire acting as a fuse.

 

In the mechanical failure mechanism, driven by a signal peak, the moving parts of the driver collide with some stationary part (e.g. the coil former impacting the back plate of the motor, or the diaphragm colliding with the phase plug in a compression driver), resulting in disintegration.

 

To protect against these failure modes, loudspeakers typically have separate RMS and peak limiters. In a real application, limiters therefore set the highest possible SPLmax (rather than failure). SPLmax must lie somewhere between the onset of unacceptable distortion and limiter activation (see Figure 3). On the left side is the small signal region, where the loudspeaker can reasonably be assumed to behave as a linear time invariant (LTI) system. On the right side is the large signal region, where non-linearity becomes significant enough that it cannot be ignored. Traditionally, SPLmax has been specified as the maximum level without damage. But if the loudspeaker is severely distorted at this level, it would not make sense to specify a system for a venue on this basis. So what is the purpose of SPLmax using this definition?

 

A similar problem exists for amplifiers. Manufacturers could quote vastly inflated numbers for power output if the output signal bears little resemblance to the input signal. For the maximum rated power output to be useful to designers, amplifier power is typically quoted at 1% and 10% THD. Surely it makes sense to use a similar approach for loudspeakers?

 

 

Figure 3: Distortion vs. Amplitude for a loudspeaker, and possible location of SPLmax, somewhere between the onset of the large signal region and limiter activation.

 

 

Figure 4: Harmonic distortion (red) and modulation distortion (green and blue) (adapted from Klippel)

 

2.3 Harmonic Distortion

 

Harmonic distortion products, as the name implies, are harmonics of the fundamental, i.e. 2f, 3f, 4f and so on.

 

Asymmetric non-linearity generates even-order harmonics, whereas symmetric non-linearity generates only odd-order harmonics, following Fourier theory.

 

While still commonly used for measuring audio electronics, harmonic distortion is not a good choice for measuring loudspeakers, as the stimulus (single tone) is not representative of music signals (as we have seen from the PDFs), and harmonic distortion in loudspeakers is typically dominated by the second harmonic, which is perceptually tolerable, even at relatively high levels.

 

2.4 Modulation Distortion

 

Modulation distortion products are sum and difference frequencies of two modulating tones f1 and f2, i.e. f1+f2 and f1-f2. Higher order modulation products result from harmonics, e.g. 2f1+f2 and so on.
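As an illustration, the product frequencies for a pair of tones can be enumerated directly. This is a minimal sketch: the function name `modulation_products` and the example tones of 50 Hz and 1 kHz are illustrative choices, and the order of a product is taken as |m|+|n| for a component at |m·f1 + n·f2|.

```python
def modulation_products(f1, f2, max_order=3):
    """List (frequency, order, m, n) for products |m*f1 + n*f2| with m, n both non-zero.

    The order of a product is |m| + |n|. Only m >= 1 is enumerated, because
    (m, n) and (-m, -n) land on the same positive frequency.
    """
    products = []
    for m in range(1, max_order):
        for n in range(-(max_order - m), max_order - m + 1):
            if n == 0:
                continue  # n == 0 is a pure harmonic of f1, not a modulation product
            products.append((abs(m * f1 + n * f2), m + abs(n), m, n))
    return sorted(products)

for freq, order, m, n in modulation_products(50, 1000):
    print(f"{freq:6.0f} Hz  order {order}  ({m}*f1 {n:+d}*f2)")
```

For a low tone modulating a high tone, the products cluster as sidebands around the high tone and its second harmonic, none of which are harmonics of either stimulus tone.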

 

Modulation distortion is a better choice for evaluating loudspeaker distortion, as a sparse multitone stimulus is broadband, pseudo-random and has a higher crest factor than a sine wave. A multitone stimulus will elicit both harmonic and modulation distortion components from a non-linear system, whereas a single sine wave will only produce minimal modulation distortion between harmonic distortion components. Unlike harmonic distortion, modulation distortion has sum and difference frequencies that are not necessarily harmonically related. Strong harmonic and modulation distortion components will modulate each other, producing higher order modulation distortion components and so on, resulting in a dense “grass” of distortion products.

 

With this in mind, it makes sense that there is evidence that modulation distortion is perceptually more annoying than harmonic distortion. A method is explained in more detail in section 3.3.

 

2.5 Non-linearity in Electrodynamic Loudspeakers

 

There are numerous sources of distortion (non-linearity) in electrodynamic loudspeakers, listed in Table 1. All are functions of drive level (voice coil displacement and current) with the exception of cone breakup.

 

Non-linear coil resistance is more commonly known as power compression, as the increase in voice coil resistance with temperature reduces current, power and SPL.
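The size of this effect can be estimated with a simple model. The sketch below rests on several assumptions that are not taken from the measurements in this paper: a copper voice coil (temperature coefficient of roughly 0.00393 per kelvin), a constant drive voltage, and an impedance dominated by the coil resistance RE so that SPL follows coil current. Real drivers, including those with aluminium coils, will deviate from this.

```python
import math

ALPHA_COPPER = 0.00393  # per kelvin near 20 degC; assumes a copper voice coil

def hot_resistance(r_cold, delta_t, alpha=ALPHA_COPPER):
    """Coil resistance after a temperature rise of delta_t kelvin."""
    return r_cold * (1 + alpha * delta_t)

def power_compression_db(r_cold, delta_t, alpha=ALPHA_COPPER):
    """Estimated SPL loss in dB for constant drive voltage, current limited only by RE."""
    return 20 * math.log10(hot_resistance(r_cold, delta_t, alpha) / r_cold)

# Hypothetical example: a 6 ohm coil heated by 100 K
print(f"{hot_resistance(6.0, 100.0):.2f} ohm")
print(f"{power_compression_db(6.0, 100.0):.2f} dB of compression")
```

Under these assumptions a 100 K rise costs a little under 3 dB, which is of the same order as the 2-3 dB stop criteria discussed for AES75 later in the paper.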

 

| Non-linearity | Function of | Frequency range | Mechanism |
| --- | --- | --- | --- |
| Compliance (CMS) | Displacement (x) | Below Fs | Non-linear restoring force |
| Force Factor (Bl) | Displacement (x); Current (i); Velocity (u) | Low frequencies (highest displacement) | Non-linear force f=Bli; non-linear damping e=Blu |
| Coil Inductance (LE) | Displacement (x); Current (i) | High frequencies; HF modulation by LF | Variation of inductance with coil position; non-linear permeability of steel |
| Coil Resistance (RE) | Current (i) | All frequencies | Resistance increases with temperature |
| Young’s Modulus | Strain (ε) | Modal frequencies of cone | Stress as a non-linear function of strain |
| Modulation (IMD) | Displacement (x) | HF modulation by LF | Variable time shift in propagated sound producing modulation distortion |

 

Table 1: Overview of non-linearities in electrodynamic loudspeakers (adapted from Beranek, Acoustics (2012))

 

3 METHODS

 

3.1 IEC 60268-21

 

SPLmax is defined by IEC 60268-21 as the maximum continuous SPL that can be reproduced for 100 hours without damage. The standard is quite loose, and leaves the manufacturer a lot of scope to choose the measurement conditions and stimulus. However, there are some requirements:

  • The test signal represents typical program material in the final application
  • The same test signal is used for both the maximum input voltage test and SPLmax test

IEC 60268-21 suggests a test stimulus which is the “simulated programme signal” also known as clipped pink noise.

Figure 5: Simulated programme signal from IEC 60268-21

 

A flaw in this setup is that the crest factor is specified before the amplifier.

 

In modern professional audio systems, the amplifier typically incorporates DSP filtering and limiters, which can significantly change the crest factor of the amplifier output signal.

 

However, in professional audio, IEC 60268-21 appears to be rarely used for SPLmax. Instead, IEC 60268-1 pink noise (crest factor of 4, i.e. 12dB) is most commonly used as a stimulus, and the peak SPL at 1m in free field conditions is quoted. Other pink noise stimuli are used by various manufacturers, with crest factors ranging from 6dB to 12dB, as seen in Table 2.
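Because crest factor is quoted both as a ratio and in decibels, it helps to convert between the two and to see how it separates a peak figure from a continuous one. A minimal sketch follows; the function names are illustrative.

```python
import math

def crest_ratio_to_db(ratio):
    """Crest factor as a peak/RMS voltage ratio, converted to decibels."""
    return 20 * math.log10(ratio)

def crest_db_to_ratio(db):
    """Crest factor in decibels, converted to a peak/RMS voltage ratio."""
    return 10 ** (db / 20)

def peak_spl(continuous_spl_db, crest_factor_db):
    """Peak SPL implied by a continuous (RMS) SPL and the stimulus crest factor."""
    return continuous_spl_db + crest_factor_db

print(f"ratio 4 = {crest_ratio_to_db(4):.2f} dB")
print(f"6 dB = ratio {crest_db_to_ratio(6):.2f}")
print(f"120 dB continuous with 12 dB crest factor = {peak_spl(120.0, 12.0):.0f} dB peak")
```

The same loudspeaker at the same drive level therefore yields a quoted peak figure 6 dB lower if measured with 6 dB crest factor noise than with 12 dB crest factor noise, assuming the crest factor survives the signal chain, which is one reason figures obtained with different stimuli are not directly comparable.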

 

Table 2: survey of quoted SPLmax methods used by professional audio manufacturers.

 

| Stimulus | Measurement conditions | Units |
| --- | --- | --- |
| Pink noise (12dB crest factor) | Free field, 1m | dB peak |
| Pink noise (6dB crest factor) | Calculated at 1m | dB peak |
| M-Noise (AES75) | Free field, 4m scaled to 1m | dBZ peak and continuous |
| Not specified | Not specified | dB peak and continuous |
| Pink noise (12dB crest factor) | Free field, 1m, named preset | dB peak |
| Pink noise (10dB crest factor) | Calculated at 1m | dB peak |
| Not specified | 1m | dB peak |
| Not specified | 10% THD | dB peak |
| Pink noise (12dB crest factor) | Free field, 1m, named preset | dB peak |
| Not specified | Not specified | dB peak |
| “Band limited pink noise” | 1m, amplifier clipping | dB peak |
| Pink noise (6dB crest factor) | 1m, half space | dB peak |
| “85ms burst” | 1m | dB peak |

 

3.2 AES75

 

AES75 is a significant improvement over IEC 60268-21 as it introduces coherence and therefore considers power compression and distortion. The design of the methodology is intended for sound engineers and system technicians to be able to verify quoted SPLmax themselves using commonly accessible acoustic measurement equipment such as Real Time Analyser (RTA) software.


 

Figure 6: AES75 test schematic

 

The signal is M-Noise, a pink (1/f) spectrum noise signal with increasing crest factor with frequency, which more closely resembles the signal characteristics of music.

 

The signal level is increased until either:

  • The live measurement differs from the reference frequency response by at least 2dB over at least two octaves
  • The live measurement differs from the reference frequency response by 3dB anywhere
  • The coherence reduction target is reached (γ² ≤ 91%).
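The three stop conditions can be expressed as a simple check on band-wise data. This is only a sketch under stated assumptions and is not the procedure text of AES75: it assumes the deviation from the reference response is available per fractional-octave band, that "over two octaves" means a contiguous run of bands, and that coherence is averaged across bands before being compared with 91%. The function name `aes75_stop_criteria` is hypothetical.

```python
def aes75_stop_criteria(deviation_db, coherence, bands_per_octave=3):
    """Return the list of stop conditions met.

    deviation_db: live-minus-reference magnitude per band, ordered by frequency.
    coherence:    magnitude-squared coherence per band, between 0 and 1.
    Contiguity of the 2 dB run and averaging of coherence are assumptions.
    """
    reasons = []
    two_octaves = 2 * bands_per_octave
    run = longest = 0
    for d in deviation_db:
        run = run + 1 if abs(d) >= 2.0 else 0
        longest = max(longest, run)
    if longest >= two_octaves:
        reasons.append("2 dB over two octaves")
    if any(abs(d) >= 3.0 for d in deviation_db):
        reasons.append("3 dB anywhere")
    if sum(coherence) / len(coherence) <= 0.91:
        reasons.append("coherence <= 91%")
    return reasons

# Hypothetical data: 30 third-octave bands, six adjacent bands compressed by 2.2 dB
deviation = [-0.3] * 12 + [-2.2] * 6 + [-0.3] * 12
print(aes75_stop_criteria(deviation, [0.97] * 30))
```

Returning every condition that is met, rather than only the first, makes it easier to see whether compression or coherence loss ended a given run.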

 

Coherence is a feature of most RTA software, which provides the user some indication of the quality of the measurement in terms of signal to noise ratio. This naturally includes distortion products, and coherence will reduce as the magnitude of distortion products increases.

 

The function itself is:

$$\gamma^2(f) = \frac{|G_{xy}(f)|^2}{G_{xx}(f)\,G_{yy}(f)}$$

Where Gxy(f) is the cross-spectral density between x and y, and Gxx(f) and Gyy(f) are the auto-spectral densities of x and y respectively. This produces a value between 0 and 1, which can be represented as a percentage or in decibels.
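A coherence estimate built from Gxy, Gxx and Gyy can be sketched in a few lines. The version below is a simplified illustration and is not how any particular RTA package implements it: it evaluates a single DFT bin, uses non-overlapping segments with no window, and uses Gaussian noise as the reference signal. All names are illustrative.

```python
import cmath
import math
import random

def segment_dft(segment, k):
    """Single DFT bin k of one segment (no windowing, for brevity)."""
    n = len(segment)
    return sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(segment))

def coherence(x, y, seg_len, k):
    """Magnitude-squared coherence |Gxy|^2 / (Gxx * Gyy) at bin k,
    with the spectral densities averaged over non-overlapping segments."""
    gxx = gyy = 0.0
    gxy = 0j
    for s in range(len(x) // seg_len):
        xs = segment_dft(x[s * seg_len:(s + 1) * seg_len], k)
        ys = segment_dft(y[s * seg_len:(s + 1) * seg_len], k)
        gxx += abs(xs) ** 2
        gyy += abs(ys) ** 2
        gxy += xs.conjugate() * ys
    return abs(gxy) ** 2 / (gxx * gyy)

SEG_LEN, K = 64, 8
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(64 * SEG_LEN)]
extra = [rng.gauss(0.0, 1.0) for _ in range(len(x))]
y_clean = [0.5 * a for a in x]                        # pure gain: fully coherent
y_noisy = [0.5 * a + 0.5 * b for a, b in zip(x, extra)]  # equal power of unrelated signal

print(f"clean: {coherence(x, y_clean, SEG_LEN, K):.3f}")
print(f"noisy: {coherence(x, y_noisy, SEG_LEN, K):.3f}")
```

Any output content that is not linearly related to the input, whether noise or distortion products, lowers the estimate in the same way, so coherence alone does not indicate which of the two is responsible.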

 

3.3 Total Modulation Distortion (TMD) Threshold

 

There is some evidence in literature that modulation distortion is the most perceptually annoying or objectionable form of distortion in loudspeakers. Additionally, music is more similar to a multitone stimulus than a single sine wave or random noise. Therefore it makes sense to focus attention on a loudspeaker’s response to a multitone stimulus, and the level of modulation distortion it produces.

 

This method is based on the multi-tone measurement method in IEC 60268-21 to separate the distortion products from the multitone stimulus in a loudspeaker measurement. In this example, this is achieved with cascaded notch filters, although there are other methods.

 

 

Figure 7: Separation of IMD and fundamental (adapted from Klippel and IEC 60268-21)

 

The method above is extended to measure the RMS level of distortion products relative to the RMS level of the multitone stimulus, to produce a total modulation distortion (TMD) level expressible in decibels or as a percentage:

$$\mathrm{TMD} = \frac{\mathrm{RMS}(\mathrm{Distortion})}{\mathrm{RMS}(\mathrm{Fundamental})} \times 100\%$$

Where Distortion is an array containing frequency bins of the separated IMD spectrum, and Fundamental is an array containing the separated multitone stimulus/fundamental. The level is increased until the RMS level of the distortion products reaches an arbitrary threshold relative to the RMS level of the stimulus, e.g. 5%, to determine SPLmax.
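The separation and ratio described above can be illustrated with a toy example. This is a sketch and not the author's program: it separates the stimulus from the distortion by selecting DFT bins rather than by cascaded notch filters, and a polynomial stands in for the loudspeaker. The tone bins, polynomial coefficients and function names are all hypothetical.

```python
import cmath
import math
import random

N = 512                          # samples in one period of the multitone
TONE_BINS = [3, 7, 17, 37, 79]   # sparse stimulus bins (hypothetical choice)

def multitone(amplitude, seed=1):
    """Equal-amplitude multitone with fixed pseudo-random phases."""
    rng = random.Random(seed)
    phases = [rng.uniform(0, 2 * math.pi) for _ in TONE_BINS]
    return [amplitude * sum(math.cos(2 * math.pi * k * i / N + p)
                            for k, p in zip(TONE_BINS, phases))
            for i in range(N)]

def magnitude_spectrum(signal):
    """Magnitudes of DFT bins 0 .. N/2 - 1 (direct DFT, adequate for small N)."""
    n = len(signal)
    return [abs(sum(s * cmath.exp(-2j * math.pi * k * i / n)
                    for i, s in enumerate(signal))) / n
            for k in range(n // 2)]

def root_sum_square(bins):
    return math.sqrt(sum(m * m for m in bins))

def tmd_percent(signal):
    """Distortion bins relative to stimulus bins, as a percentage. DC is ignored."""
    spectrum = magnitude_spectrum(signal)
    fundamental = [spectrum[k] for k in TONE_BINS]
    distortion = [m for k, m in enumerate(spectrum) if k != 0 and k not in TONE_BINS]
    return 100 * root_sum_square(distortion) / root_sum_square(fundamental)

def toy_loudspeaker(x, a2=0.05, a3=0.02):
    """Hypothetical weakly non-linear system: y = x + a2*x^2 + a3*x^3."""
    return [s + a2 * s * s + a3 * s ** 3 for s in x]

tmd_linear = tmd_percent(multitone(0.5))
tmd_low = tmd_percent(toy_loudspeaker(multitone(0.1)))
tmd_high = tmd_percent(toy_loudspeaker(multitone(0.5)))
print(f"linear: {tmd_linear:.2e} %  low drive: {tmd_low:.2f} %  high drive: {tmd_high:.2f} %")
```

Raising the drive level raises the ratio, so an SPLmax search reduces to increasing the level until `tmd_percent` crosses the chosen threshold.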

 

The level of distortion products can also be plotted against input voltage or SPL to visualize the linearity of the transducer.

 

A very similar method is used by the German magazine Production Partner in its loudspeaker reviews, although the distortion level is quoted in dB and the threshold varies somewhat between reviews, typically around -25 to -20dB. A method showing the relative multitone distortion in dB is also available in Klippel dB Lab, in the Multitone Measurement (MTON) module.
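Thresholds quoted in decibels and in percent can be compared by converting between the two, treating both as amplitude ratios relative to the stimulus. A minimal sketch, with illustrative function names:

```python
import math

def db_to_percent(level_db):
    """Relative distortion level in dB, converted to a percentage of the stimulus."""
    return 100 * 10 ** (level_db / 20)

def percent_to_db(percent):
    """Distortion as a percentage of the stimulus, converted to a relative level in dB."""
    return 20 * math.log10(percent / 100)

for level in (-25, -20):
    print(f"{level} dB = {db_to_percent(level):.1f} %")
for pct in (1, 5):
    print(f"{pct} % = {percent_to_db(pct):.1f} dB")
```

On this basis a threshold of -25 to -20dB corresponds to roughly 5.6% to 10%, while thresholds of 1% and 5% correspond to -40dB and about -26dB.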

 

Further work is required to arrive at a mean threshold for perceptual annoyance from IMD in loudspeakers. The method could then be implemented in a future standard for SPLmax.

 

4 RESULTS

 

Two different loudspeaker design approaches were tested for each method, a “conventional” design with a 12”+1.4” coaxial driver, and a wide bandwidth, low distortion 10” midrange and 1.4” exit compression driver design.

 

In all cases the measurement microphone was a B&K 4007, with a maximum linear SPL of 155dB.

 

Both loudspeaker designs utilized active crossovers, as passive crossovers can be additional sources of distortion.

 

The crossover point for the 12” coaxial loudspeaker was 1.2kHz.

 

The crossover point for the 10” midrange loudspeaker was 4kHz.

 

4.1 Pseudo-IEC 60268-21 (12dB Crest Factor Pink Noise)

 

This method was performed using the Clio electroacoustic measurement system. The method was modified to be more similar to the unofficial industry “standard” method of using 12dB crest factor pink noise, turning the loudspeaker up as loud as it will go, and quoting the highest measured peak SPL value. The continuous SPL is also quoted, as this is what the standard actually specifies.

 

| Loudspeaker | SPLmax (continuous) | SPLmax (peak) |
| --- | --- | --- |
| Conventional loudspeaker | 120 dB | 132 dB |
| Wide bandwidth midrange | 122 dB | 134 dB |

 

4.2 AES75

 

This method was performed using Smaart 8 RTA software with the M-Noise signal. The quoted input level is at the balanced analogue input of the digital active crossover. The amplifier gain was 32dB.

 

| Loudspeaker | RMS input level | SPLmax dBZ (continuous) | SPLmax dBZ (peak) | SPLmax dBA (continuous) |
| --- | --- | --- | --- | --- |
| Conventional loudspeaker | +3.3 dBu | 118 dB | 132 dB | 114 dB |
| Wide bandwidth midrange | +3.3 dBu | 122 dB | 137 dB | 121 dB |
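For orientation, the quoted input level and amplifier gain can be turned into voltages. This is a rough sketch only: it assumes unity gain through the digital crossover and no limiting, neither of which is stated, and each band-limited amplifier channel would in practice carry less than the full-band figure computed here.

```python
DBU_REF_VOLTS = 0.7746  # 0 dBu corresponds to about 0.7746 V RMS

def dbu_to_volts(level_dbu):
    """RMS voltage for a level in dBu."""
    return DBU_REF_VOLTS * 10 ** (level_dbu / 20)

INPUT_LEVEL_DBU = 3.3   # RMS input level quoted for both loudspeakers
AMP_GAIN_DB = 32.0      # quoted amplifier gain

input_volts = dbu_to_volts(INPUT_LEVEL_DBU)
# Upper bound on amplifier output, assuming unity crossover gain (an assumption)
output_volts = dbu_to_volts(INPUT_LEVEL_DBU + AMP_GAIN_DB)

print(f"input:  {input_volts:.2f} V RMS")
print(f"output: {output_volts:.1f} V RMS (full-band upper bound)")
```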

 

4.3 Total Modulation Distortion (TMD) Threshold

 

This method was performed using a program written by the author.

 

The program is SPL calibrated with an automatic routine using a microphone calibrator. The crest factor of the multitone signal was ~4 (4.55).

 

Further work is needed to establish perceptual thresholds for TMD, but for the purposes of this test, 1% was deemed to be “probably not very audible” and 5% was “very audible”.

 

| Loudspeaker | SPLmax at 1% TMD (continuous) | SPLmax at 1% TMD (peak) | SPLmax at 5% TMD (continuous) | SPLmax at 5% TMD (peak) |
| --- | --- | --- | --- | --- |
| Conventional loudspeaker | 100.8 dB | 112.6 dB | 120.6 dB | 132.5 dB |
| Wide bandwidth midrange | 112.7 dB | 124.6 dB | 125.7 dB * | 138.4 dB * |

 

*4.1% TMD, as 5% could not be reached without possible loudspeaker damage.

An example of the output of the program is shown below:

 

Noise Floor = 51.7 dB SPL

SPLmax (continuous) = 100.8 dB SPL

SPLmax (peak) = 112.6 dB SPL

SPLIMD = 61.0 dB SPL

TMD = 1.01 %

 

Figure 8 shows the graphical output of the program.

 

From left to right, the plots show the FFT of the measured data, the filtered distortion signal with the multitone stimulus removed, and the FFT of the noise floor immediately preceding the measurement, for reference.

 

 

Figure 8: Example of graphical output of TMD program

 

5 CONCLUSIONS

 

5.1 Pseudo-IEC 60268-21 (12dB Crest Factor Pink Noise)

 

The advantage of this method is that it is relatively simple, requiring only a pink noise source and an SPL meter with a max hold function on the peak reading. Another advantage is that several manufacturers appear to have converged on a common method of quoting peak SPL in free field conditions at 1m, with 12dB crest factor pink noise. While it is not a very good method, it at least allows users to compare loudspeakers if the methods are the same.

 

The disadvantages are numerous. The SPLmax typically quoted by manufacturers is misleading, as while an IEC 60268-21 pink noise signal may be used, the peak SPL is quoted instead of the continuous SPL, and it is very dubious whether loudspeakers would be able to sustain the peak SPL quoted for several minutes, let alone 100 hours, as a result of power compression. (This fact has seemingly spurred the development of AES75). Some manufacturers are more honest, and describe this signal as a “noise burst”. Another misleading aspect of peak SPL relates to line array loudspeakers, which often have multiple HF drivers capable of peak SPL output far in excess of other parts of the system (MF or LF drivers), which gives an unbalanced picture of the loudspeaker performance.

 

Perhaps most importantly, this method does not consider distortion. As can be seen from the comparative TMD measurements in 4.3, all loudspeakers have very audible and objectionable levels of distortion when driven to the peak SPL levels typically quoted by manufacturers (130-150dB peak).

 

If a system was specified based on quoted SPLmax and expected to be normally operated at these levels, the quality of audio would be extremely poor, and the loudspeakers would be unlikely to last very long. A criticism of this hypothetical scenario might be “don’t specify systems based on SPLmax”, but this raises the counterpoint of “what is SPLmax for?”

 

SPLmax must be more meaningful for users, which leads us to AES75.

 

5.2 AES75

 

AES75 has been designed for sound engineers to be able to validate SPLmax themselves, using commonly available RTA software and equipment, which is an admirable approach.

 

The primary advantage is that it does consider power compression, which leads to a far more realistic SPLmax that, in theory, the loudspeaker should be able to sustain indefinitely. Evaluating power compression by the change in transfer function appeared to be quite effective, and closely tracked the rise in impedance measured via differential and current probes. Unfortunately, distortion did not appear to affect the coherence of the measurement in a clear and reliable way, making it difficult to determine how much distortion was being produced. Another disadvantage is that the method is very time consuming, as the standard specifies that the level may be increased at a maximum rate of 1dB per minute. It is also very important to stop the measurement immediately once the power compression reaches 2-3dB, as with most loudspeakers the power compression increases very rapidly beyond the initial 2dB if the level is increased further, which will lead to failure.

 

In terms of comparative levels, the continuous SPLmax established by AES75 appears to be slightly lower than the simple pink noise method. Depending on where limiters are set, more metrologically rigorous manufacturers would probably find that their continuous 12dB crest factor pink noise measurement is quite similar to the AES75-derived value.

 

5.3 Total Modulation Distortion (TMD) Threshold

 

The TMD method takes the concept of AES75 much further, and is focused entirely on distortion, as loudspeakers are severely non-linear before power compression becomes a significant problem. Ultimately, loudspeakers are designed for humans to listen to, and this fact should be considered in a specification for SPLmax. With a listener-centric approach, there is no point quoting an SPLmax if the loudspeaker is unpleasant or even unlistenable at this level.

 

With this in mind, the TMD method quotes continuous and peak SPLmax at 1% and 5% TMD (total modulation distortion). These limits were chosen somewhat arbitrarily as 1% was not very audible, and 5% was very audible, and subjectively rather unpleasant. Further work is needed to establish perceptual thresholds for TMD. This could be achieved by loudspeaker modelling, or playing signals from a real loudspeaker at safe levels to listeners via headphones.

 

As might be expected, the SPLmax values at 1% TMD are substantially lower than the values produced by the pink noise or AES75 methods. The difference varies significantly with loudspeaker design: the continuous 1% TMD SPLmax of the conventional 2-way loudspeaker was 17dB lower than the AES75 value, whereas for the wide-bandwidth midrange design it was only 9dB lower. Additionally, the 1% TMD SPLmax of the wide bandwidth midrange design was 12dB higher than that of the conventional loudspeaker, demonstrating the value of the low distortion design approach.
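These comparisons follow directly from the continuous SPLmax values reported in sections 4.1 to 4.3. The short sketch below simply collects those figures and takes the differences; the dictionary keys are illustrative labels.

```python
# Continuous SPLmax values in dB, taken from the results tables in section 4
continuous_spl = {
    "conventional":   {"pink_noise": 120.0, "aes75": 118.0, "tmd_1pct": 100.8},
    "wide_bandwidth": {"pink_noise": 122.0, "aes75": 122.0, "tmd_1pct": 112.7},
}

# How far the 1% TMD figure sits below the AES75 figure for each design
drop = {name: v["aes75"] - v["tmd_1pct"] for name, v in continuous_spl.items()}

# Advantage of the wide bandwidth design at the 1% TMD threshold
design_gain = (continuous_spl["wide_bandwidth"]["tmd_1pct"]
               - continuous_spl["conventional"]["tmd_1pct"])

for name, value in drop.items():
    print(f"{name}: 1% TMD SPLmax is {value:.1f} dB below AES75")
print(f"wide bandwidth vs conventional at 1% TMD: {design_gain:.1f} dB")
```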

 

The differences between loudspeaker designs and measurement methods reduce substantially at the 5% TMD level, suggesting that all loudspeakers are severely non-linear at these levels. Notably, however, the wide-bandwidth midrange design produced a higher peak SPLmax at 4.1% TMD than in any other test.

 

An advantage of the TMD method is that it is very quick, and can be fully automated once the test is set up and calibrated. A single measurement takes a few seconds. The complete automated test can be performed in under a minute. Software such as Klippel dB Lab and WinMF have existing capabilities for measuring total modulation distortion in dB relative to the fundamental.

 

However, the TMD method could be improved in a number of ways. One disadvantage of the method as used in this paper is that the multitone signal has equal amplitude for each sinusoid, which results in more high frequency energy than a typical music signal. This will over-estimate distortion (and under-estimate SPLmax). This could be resolved by adjusting the level of each sinusoid.

 

5.4 Final Conclusions

 

In the current state of affairs, sound engineers and designers must navigate misleading and incomparable figures from manufacturers, leading to confusion and in some cases, poor performing systems and bad audience experiences.

 

IEC 60268-21 is typically not correctly used in professional audio. Instead 12dB crest factor pink noise is used as a stimulus and the peak SPL at 1m in free field conditions is quoted, rarely with any further qualifying information. Loudspeakers are unlikely to be able to sustain the levels quoted due to power compression, and will be unacceptably distorted.

 

AES75 is a substantial step in the right direction by considering power compression. This means that in theory, the loudspeaker should be able to produce the quoted SPLmax indefinitely. However, AES75 does not make distortion clearly visible or quantifiable, showing only a small reduction in coherence.

 

This led to the development of the TMD method, which specifies SPLmax based entirely on distortion thresholds. The result is two sets of SPLmax values at 1% (reasonable use) and 5% (absolute maximum). This gives users a much more useful picture of loudspeaker performance, with the ultimate goal of improving audio quality and audience experiences.

 

6 REFERENCES

 

  1. R Schwenke, M Van Veen, “Determining The Source of Coherence Reduction Using Playback Level of M-Noise”, Institute of Acoustics, Reproduced Sound 2020
  2. M. van Veen, and R. Schwenke, "Coherence as an Indicator of Distortion for Wide-Band Audio Signals such as M-Noise and Music," Engineering Brief 559, (2019 October.)
  3. P. W. Klipsch, "Modulation Distortion in Loudspeakers," J. Audio Eng. Soc., vol. 17, no. 2, pp. 194, 196, 198, 200, 202, 204, 206, (1969 April.).
  4. E. R. Geddes, and L. W. Lee, "Auditory Perception of Nonlinear Distortion," Paper 5891, (2003 October.).
  5. A. Voishvillo, "Assessment of Nonlinearity in Transducers and Sound Systems – From THD to Perceptual Models," Paper 6910, (2006 October.).
  6. A. Voishvillo, "Measurements and Perception of Nonlinear Distortion—Comparing Numbers and Sound Quality," Paper 7174, (2007 October.).
  7. E. Czerwinski, A. Voishvillo, S. Alexandrov, and A. Terekhov, "Multitone Testing of Sound System Components - Some Results and Conclusions, Part 1: History and Theory," J. Audio Eng. Soc., vol. 49, no. 11, pp. 1011-1048, (2001 November.).
  8. E. Czerwinski, A. Voishvillo, S. Alexandrov, and A. Terekhov, "Multitone Testing of Sound System Components - Some Results and Conclusions, Part 2: Modeling and Application," J. Audio Eng. Soc., vol. 49, no. 12, pp. 1181-1192, (2001 December.).
  9. https://www.klippel.de/fileadmin/klippel/Files/Know_How/Webinar/IEC%2A21/EN%20H andout/KLIPPEL%20LIVE%20Series1_Part5_Maximum%20SPL%20%E2%80%93%20 Giving%20this%20Value%20Meaning.pdf