
Digital Dynamics Processing: It's All In The "Samples!"

By Frank Foti
November, 2000

Introduction
One of the most emotional topics for discussion in broadcasting is audio processing. Why? Well, it's dynamic in so many ways! Not just in the actual level-adjusting action of the device itself, but in the plethora of viewpoints, opinions and philosophies about this combination of technology and "black magic". I must admit that as a former chief engineer and now designer of this stuff, I too can easily become embroiled in a good chat about processing. From the Telos/Omnia R&D offices via the Internet, I've shared many a thought about this exciting "personality enhancement" with many radio stations. So it stands to reason that the topic we are about to visit will certainly draw a few comments from the many passionate folks who study, design, adjust, and work with audio processing.

This paper is an outcome of our recent development work on our newest processor, the Omnia-6fm. No, this is not a sales-related piece about all of the new "doo-dads" and "whatcha-ma-call-it" functions. It's a short essay on the key element of dynamics processing: sampling. We have found that this sole function, in digital audio, can be the basis of difference between a merely acceptable design and an excellent one. If I may modestly mention here, this is probably one of the elements that has led to the world-leading success of our Omnia.fm product. I'll say nothing more on that, as I'm sure you know the story by now. But this issue of sampling came to light for us again in this latest development phase, and I'm sharing some of that new information with all of you here in this forum.

Sampling Rate: A Fast Review…
If you’re experienced with digital audio, I suggest you skip this section. If not, please read on. In any digital audio system, the basis for audio data acquisition and transportation is related to something known as sampling rate. Simply defined, this is a function by which the audio signal is sampled and measured very quickly at a specified rate. This is usually referred to in the form of a frequency, or the rate of the samples as they are acquired. Hence the term sampling rate.

Now, for audio signals, it has been determined that to reliably capture the entire audio spectrum so as to reproduce it with low distortion, the sampling rate should be at least twice the highest frequency to be captured. For example, our hearing range is known to be 20Hz – 20kHz, so the lowest sampling rate for full range audio should be at least 40kHz. This minimum of twice the highest audio frequency is known as the Nyquist rate. Its counterpart, one-half of the sampling rate, is the Nyquist frequency: the highest frequency a sampled system can represent. The compact disc audio player employs a sampling rate of 44.1kHz. In the professional audio world, this is the lowest rate used to obtain full range audio. Most recording studios employ 48kHz or even 96kHz for recording and mastering purposes.

This essay is not written to debate the merits of which rate is optimum for pure aural purposes. There are many excellent papers on the topic of higher-sampling rates for improved sonic performance that may interest the reader, and we have listed several on our website. This paper rather, will focus on how sampling rate affects the operation of a dynamics processor. This issue of sampling rate is, in fact, very influential to other functions besides the audio signal itself, which will become more apparent soon!

Aliasing Distortion: The Enemy Of Digital Audio
Within the digital audio system the frequency spectrum must be contained below one-half of the sampling rate, a limit known as the Nyquist frequency. Why is this important? Since the sampling rate defines the limited range of audio spectrum, any signal that tries to exceed this range will run out of spectrum! When this occurs, the excess frequency will reflect, and travel back down into the audio spectrum as a distortion component. This is known as aliasing distortion. For instance, in a 48kHz sampled system, any signal that tries to exceed 24kHz will reflect itself back down into the audio domain.

Example: Let’s say that in our 48kHz system, some process generates a harmonic of a 15kHz signal in the source, such as its second harmonic at 30kHz. Since 30kHz is 6kHz beyond the system’s reproducible maximum of 24kHz, an aliasing product is generated at 18kHz from the 6kHz of reflection that occurs due to lack of spectrum. To verify this, subtract the 24kHz Nyquist frequency from the desired 30kHz signal, which leaves a "leftover" 6kHz. Subtracting that 6kHz from 24kHz yields the aliasing component at 18kHz.
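The folding arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the reflection rule described above; the function name alias_frequency is an assumption made for this sketch and does not come from any product or library.

```python
def alias_frequency(f_hz, fs_hz):
    """Return the frequency at which a tone at f_hz appears once sampled at fs_hz."""
    nyquist = fs_hz / 2.0
    folded = float(f_hz) % fs_hz      # the sampled spectrum repeats every fs_hz
    if folded > nyquist:
        folded = fs_hz - folded       # reflect back down from the Nyquist frequency
    return folded


# The example from the text: a 30 kHz harmonic in a 48 kHz system.
print(alias_frequency(30_000, 48_000))
# The same harmonic in a 96 kHz system stays where it belongs.
print(alias_frequency(30_000, 96_000))
```

In the 48kHz case the function returns the 18kHz product worked out above, while in a 96kHz system the same 30kHz harmonic is below the Nyquist frequency and is not reflected.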

Generally speaking, normal digital audio systems like CD players and DAT machines have nothing to worry about. But any digital system that has the potential to create harmonic energy within the system must be careful of creating aliasing distortion. This is possible within a dynamics processor, as there are numerous functions (hard limiters, for instance) that generate signal spectrum which will try to exceed the Nyquist frequency. Just as a clipper creates harmonic products in an analog design, the same holds true in the digital domain. Even if a low-pass filter is applied after such a function, it cannot remove these products, because harmonics that try to exceed the Nyquist limit have already folded back into the audio band. Our previous design efforts have proved that we are the only audio processor manufacturer to observe this, and to take extra pains in designing our digital processing system so that added processing distortions were not generated in the digital domain.
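To make the clipper case concrete, here is a minimal sketch that hard-clips a sine wave in a 48kHz system and measures what lands at the alias frequency. The 7kHz tone, the 0.5 clip threshold and the helper bin_amplitude are all illustrative assumptions, and the naive clipper shown is not any manufacturer's design. The tone's fifth harmonic would sit at 35kHz, which cannot exist at this sampling rate, so it folds down to 13kHz.

```python
import cmath
import math

FS = 48_000        # sampling rate in Hz
N = 480            # 10 ms of audio, so each DFT bin is 100 Hz wide
TONE_HZ = 7_000    # test tone (illustrative choice)
CLIP = 0.5         # hard-clip threshold (illustrative choice)


def bin_amplitude(signal, freq_hz):
    """Amplitude of a single DFT bin, scaled so that a unit sine reads 1.0."""
    k = round(freq_hz * len(signal) / FS)
    acc = sum(s * cmath.exp(-2j * math.pi * k * n / len(signal))
              for n, s in enumerate(signal))
    return 2.0 * abs(acc) / len(signal)


sine = [math.sin(2 * math.pi * TONE_HZ * n / FS) for n in range(N)]
clipped = [max(-CLIP, min(CLIP, s)) for s in sine]   # naive sample-by-sample clipper

# 5th harmonic of 7 kHz is 35 kHz; at fs = 48 kHz it folds to 48 - 35 = 13 kHz.
alias_clean = bin_amplitude(sine, 13_000)
alias_clipped = bin_amplitude(clipped, 13_000)
print(f"13 kHz content before clipping: {alias_clean:.6f}")
print(f"13 kHz content after clipping:  {alias_clipped:.6f}")
```

The unclipped tone has essentially nothing at 13kHz, while the clipped one does, even though 13kHz is not a harmonic of 7kHz. That in-band product is the aliasing component, and no filter placed after the clipper can tell it apart from program audio.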

This is the basis of this discussion.

Digital Processors: More Than Just Clean Clipping!
The record speaks for itself about how our products perform with respect to the final limiting ("clipping") function. Through musical program, as well as dynamic technical testing, we have proven our theories correct with respect to the design of a non-aliasing digital hard limiter/clipper. For us this is old news. What is newsworthy is what we have found to occur in the dynamics control section, where the processing itself can generate distortion. Our analysis of this is at the root of this discussion, and our position has been supported by another digital audio researcher as well [1].

To better understand this phase of the topic, we must look at how a dynamics controller such as a compressor or limiter operates. Simply stated, the audio signal is passed through a gain function block that will change or manipulate the gain by means of a control signal. It is the action of this control signal that we are most concerned about. Following is a simplified block diagram of a feed-forward peak limiter:

The audio signal appears at the input node x(n). It is passed through a delay block (DEL-1) and then on to one input of a multiply function, where it appears as x(n-D1). Additionally, a control signal g(n) is created in the feed-forward section and applied to the other input of the multiply function. This creates a dynamic gain control in which the audio signal x(n-D1) is multiplied by g(n), yielding the resultant level y(n). For our discussion here, the critical signal is the control node, which is illustrated as g(n).
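The signal flow just described, y(n) = x(n-D1) multiplied by g(n), can be sketched as follows. Only the delay and the multiplier come from the description above. The side chain that produces g(n) here (an instant-attack peak follower with an exponential release) and the parameter values are hypothetical stand-ins chosen for brevity, not the algorithm used in any Omnia product.

```python
from collections import deque


def feed_forward_limiter(x, threshold=0.5, delay=4, release=0.999):
    """Return (y, g) where y(n) = x(n - delay) * g(n). Requires delay >= 1."""
    delay_line = deque([0.0] * delay, maxlen=delay)   # DEL-1
    envelope = 0.0
    y, g = [], []
    for sample in x:
        # Feed-forward side chain (hypothetical): instant attack, slow release.
        envelope = max(abs(sample), envelope * release)
        gain = 1.0 if envelope <= threshold else threshold / envelope
        # Audio path: oldest entry in the delay line is x(n - delay).
        delayed = delay_line[0]
        delay_line.append(sample)
        g.append(gain)
        y.append(delayed * gain)                      # the multiply function
    return y, g


# A single full-scale peak in otherwise quiet audio.
burst = [0.1] * 10 + [1.0] + [0.1] * 10
y, g = feed_forward_limiter(burst)
print("lowest gain applied:", min(g))
```

Note how abruptly g(n) moves in this sketch: it drops in a single sample when the peak arrives in the side chain. That kind of step in the control signal is precisely the behavior the rest of this discussion is concerned with.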

It is this signal that determines not only the output level of the dynamic function, but also the aural texture of how the function block will sound. To relate this concept to an analog design, think of it as the operation of a four-quadrant multiplier component, or VCA (voltage controlled amplifier). As anyone who has designed a gain controller will tell you, this signal must be generated with care or additional processing-related distortions and artifacts will occur. Naturally there will be some sonic tradeoffs, as the function of the multiplier is to affect the dynamic gain. But if the control signal reacts too abruptly, or not in a smooth fashion, IMD (Intermodulation Distortion) will be easily heard in the signal. This is what is commonly described as a busy or squashed sound.

Within the digital design we must be careful of all these parameters, plus one additional component: our old nemesis aliasing distortion. How can this occur? The answer lies in the characteristics of control signal g(n). To understand the aspects of this, we must first consider what can transpire in this signal, especially when employed as a dynamic peak limiter.

The control signal is composed of functions that must respond to the audio envelope. For a peak limiter, there are qualities of the algorithm by which the initial transient of a short duration peak signal is either recognized or ignored. In the event that the limiter algorithm reacts to the transient information, it must do so in a manner that does not try to respond faster than the sampling rate of the limiter allows. This is very critical to the upper third portion of the audio spectrum; basically, above 5kHz for FM Stereo transmission. Transient audio signals in this range can respond very quickly. Should they trigger action in a peak control algorithm that is faster than the Nyquist frequency of the system, aliasing components will result in the control signal g(n).

While most limiters are designed to operate with a normalized set of time constants that are well within the sampling rate of the system, most broadcast processors employ some added multiple timing functions that are not user controlled. They are designed to operate on the short duration peaks, and reduce them in a sub-audible manner. But should the action of these added time constants result in response that approaches the Nyquist frequency of the system, aliasing distortion will result.
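A rough way to see how a very fast detector can put aliasing into the control path is to model it as a full-wave rectifier, which is about the fastest "peak detector" there is. The sketch below is an illustration under that assumption only; it is not the analysis behind the research described here, and the 14kHz test tone and helper names are invented for the example. Rectifying a 14kHz tone creates a strong second harmonic at 28kHz. At 48kHz sampling that harmonic folds down to 20kHz, while at 96kHz it stays at 28kHz, above the audio band.

```python
import cmath
import math


def component(signal, freq_hz, fs_hz):
    """Amplitude at freq_hz from a single DFT bin (a unit sine reads 1.0)."""
    n_samples = len(signal)
    k = round(freq_hz * n_samples / fs_hz)
    acc = sum(s * cmath.exp(-2j * math.pi * k * n / n_samples)
              for n, s in enumerate(signal))
    return 2.0 * abs(acc) / n_samples


def rectified_tone(tone_hz, fs_hz, seconds=0.01):
    """Full-wave rectified sine: a stand-in for a very fast peak detector."""
    n_samples = round(fs_hz * seconds)
    return [abs(math.sin(2 * math.pi * tone_hz * n / fs_hz))
            for n in range(n_samples)]


TONE_HZ = 14_000     # illustrative high-frequency program content
ALIAS_HZ = 20_000    # where the 28 kHz second harmonic lands when fs = 48 kHz

for fs in (48_000, 96_000):
    level = component(rectified_tone(TONE_HZ, fs), ALIAS_HZ, fs)
    print(f"fs = {fs} Hz: 20 kHz component in the side chain = {level:.4f}")
```

In this sketch the 20kHz alias in the side-chain signal is far smaller at 96kHz than at 48kHz, though not zero, since still higher harmonics of the rectified tone continue to fold. A real limiter smooths its detector output, so the absolute numbers here mean little; the point is the direction of the effect as the sampling rate rises.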

Due to the complex non-linearity of these signals, they are not easily measured. To do so would take careful analysis with predictable tone bursts of high frequency signals, and a very fast storage scope and audio analyzer. They are actually much easier to model in a computerized synthesis of the algorithm. Suffice it to say that any additional high-speed dynamics components which try to become part of the control signal will yield aliasing if the sampling rate is not high enough. Our research has shown conclusively that some aliasing will result even in a 48kHz sampled system; therefore any system which employs sampling at a lower rate (say, 32kHz sampling for dynamics control) is bound to cause aliasing, even in the peak limiters. Any designer who contends that sampling rate for high frequency control does not matter has failed to grasp this uniquely important aspect of processing. In listening tests comparing low-sampling rate controlled processes with higher rates, listeners unanimously choose the higher rates. The difference is very obvious, even in casual listening.

Sampling Rate Matters A Great Deal!!
Our research has shown that to properly generate a dynamics limiter control signal void of any digital artifacts, the sampling rate should be a minimum of 96kHz. This is three times the rate of the 32kHz sampled systems known to cause aliasing distortion.

Realize also that the section of audio processing we are discussing is the area before the final limiter/clipper. Should aliasing distortion be generated and added to the audio signal, it is impossible to remove. Therefore, it does not matter if the final limiter employs a very large sampling rate, for if the signal prior to final limiting has been contaminated with aliasing artifacts, damage in the final stage is unavoidable and irreversible.

The Omnia-6fm avoids any possible contamination by employing 96kHz sampling in the upper bands of the limiting section. That, coupled with our proprietary non-aliasing clipper method, allows clean, smooth and crisp reproduction of presence and high frequencies. This is quite critical for managing the steep boost of the pre-emphasis curve (where 15kHz is amplified by about 17dB with the 75µs curve used in North America, and by somewhat less, about 13.7dB, with the 50µs curve used in the rest of the world).
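The size of the pre-emphasis boost can be checked with the standard first-order formula, where the boost in dB at frequency f for time constant τ is 10·log10(1 + (2πfτ)²). The short sketch below assumes an ideal single-pole pre-emphasis network; the function name is made up for this example.

```python
import math


def preemphasis_boost_db(freq_hz, tau_seconds):
    """Boost of an ideal first-order pre-emphasis network relative to low frequencies."""
    w_tau = 2.0 * math.pi * freq_hz * tau_seconds
    return 10.0 * math.log10(1.0 + w_tau ** 2)


for tau_us in (75, 50):
    boost = preemphasis_boost_db(15_000, tau_us * 1e-6)
    print(f"{tau_us} us pre-emphasis: {boost:.1f} dB boost at 15 kHz")
```

Under that assumption the boost at 15kHz works out to roughly 17dB for 75µs and a little under 14dB for 50µs, which is why any aliasing products in the top of the band are so costly: they are raised by the same amount as the legitimate high frequencies.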

Conclusion
With a new generation of digital processors coming to market, there will no doubt be much discussion about sampling rate once again. Some will tell you that using a lower sampling rate conserves machine cycles, then turn around and waste these same saved cycles needlessly in an over-sampled clipper. In meetings with a number of the Telos/Omnia DSP engineers, we examined this approach most carefully to ensure that we were not missing anything. Our unanimous conclusion is that the proponents of limited sampling rate control processors – not us! – have indeed missed something. Something very important to the sound of audio processing.

From a designer’s standpoint, it doesn’t really matter how we employ the available DSP cycles. As long as we provide you with a great sounding product, and at a decent cost, who cares how many cycles or DSP chips we use? Did it matter how many transistors were in your first portable radio all those years ago? No, it didn’t! All that mattered was the way it sounded. That axiom held true during the early days of transistor radios, and it’s still true today with DSP technology.

Anyone who claims that 32kHz sampling is sufficient for FM Stereo dynamics audio processing is way behind the times. Sampling rate matters. It’s the basis of our digital audio spectrum, and the more samples we have, the better the resulting performance. This issue rings true again and again as research into digital audio strives to raise the bar. And it’s no different in the dynamics domain.

Happy listening…

Frank Foti

REFERENCES

[1] Mapes-Riordan, D.: "A Worst-Case Analysis for Analog-Quality (Alias-Free) Digital Dynamics Processing", 105th Convention of the Audio Engineering Society (AES), San Francisco, 1998, Preprint 4766.
