5.1 Surround Sound
Compatibility Within HD Radio™
and the Existing FM-Stereo Environment
Frank Foti
Omnia Audio
May, 2005
Preface
Surround is the Killer App for HDFM broadcasting. Based upon
the tremendous response at NAB2005, it is clear that more and
more people are becoming convinced that FM radio has the
potential to take a major step forward with this technology.
As of this
writing, there are four proposed methods for surround
broadcasting. They can be divided into two categories: the
three matrix systems and the MPEG Spatial system. This paper
will point out some critical technical issues that
broadcasters must be aware of as they consider surround, lest
they degrade and damage their FM-Stereo service.
FM-Stereo, Revisited
First, a short review of the FM-Stereo multiplex transmission
system, or mpx for short. This system has been operating
successfully since 1961. The FM-Stereo system was designed to
be compatible with the mono-only system it followed. The Left
and Right audio channels are summed to create a mono signal
that is received on mono radios. The two channels are also
subtracted to create a stereo-difference signal, which is
broadcast along with the mono sum signal. From these two
signals, the original Left and Right audio channels are
recovered: adding them yields twice the Left channel and
subtracting them yields twice the Right. This is expressed in
the algebraic equations: 2L = (L+R) + (L-R) and
2R = (L+R) - (L-R).
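The sum and difference relationships above can be sketched in a few lines of Python. This is only an illustration of the algebra; the function names are my own and the sample values are arbitrary.

```python
def encode_sum_difference(left, right):
    """Return the (L+R, L-R) pair for one pair of audio samples."""
    return left + right, left - right


def decode_sum_difference(total, difference):
    """Recover (L, R) from the sum and difference signals."""
    left = (total + difference) / 2.0   # 2L = (L+R) + (L-R)
    right = (total - difference) / 2.0  # 2R = (L+R) - (L-R)
    return left, right


# Arbitrary sample values, chosen only for the demonstration.
left, right = 0.75, -0.25
total, difference = encode_sum_difference(left, right)
print("L+R =", total, " L-R =", difference)
print("decoded:", decode_sum_difference(total, difference))
```

A mono receiver simply uses the L+R value and ignores the difference signal, which is what makes the system backward compatible.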
The L+R
signal occupies the 15kHz range from DC to 15kHz. The L-R
signal is added using double-sideband suppressed-carrier
(DSBSC) modulation, with the carrier centered at 38kHz.
Because both sidebands are transmitted, the L-R signal takes
up 30kHz of spectrum, from 23kHz to 53kHz. The 38kHz
suppressed carrier is derived by doubling a 19kHz pilot tone.
The pilot is also used to signal to a receiver that a stereo
broadcast is present. The block diagram
below illustrates the spectrum (after FM demodulation) of the
FM-Stereo multiplex system:
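The band edges quoted above can also be checked with simple arithmetic. The following is a minimal sketch using only the figures given in the text (15kHz audio bandwidth, 19kHz pilot); the function and key names are hypothetical.

```python
AUDIO_BANDWIDTH_HZ = 15_000  # L+R occupies DC to 15kHz (from the text)
PILOT_HZ = 19_000            # pilot tone frequency (from the text)


def stereo_band_plan(audio_bw_hz=AUDIO_BANDWIDTH_HZ, pilot_hz=PILOT_HZ):
    """Return the main band edges of the demodulated FM-Stereo multiplex."""
    subcarrier_hz = 2 * pilot_hz  # suppressed carrier is twice the pilot
    return {
        "main_channel": (0, audio_bw_hz),
        "pilot": pilot_hz,
        "subcarrier": subcarrier_hz,
        # DSBSC places one sideband on each side of the subcarrier.
        "stereo_subchannel": (subcarrier_hz - audio_bw_hz,
                              subcarrier_hz + audio_bw_hz),
    }


for name, value in stereo_band_plan().items():
    print(name, value)
```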

The L-R
component is most sensitive to transmission impairments. As
you’ve probably experienced, multipath can disrupt stereo
performance very annoyingly. Whenever L-R content exists in
the FM-Stereo transmission, spectral energy is generated in
the 23kHz – 53kHz range. This frequency range also affects the
modulation index, which correlates with the number of
significant RF sideband pairs created by the frequency
modulation process. As modulation level increases, more
sideband pairs are generated and the modulation index rises;
for a given deviation, the index falls as the modulating
frequency rises. The fundamental technical reason why
increasing L-R increases multipath distortion is that the
modulation index of the stereo subchannel (23kHz – 53kHz) is
much lower than the modulation index of the main channel.
Multipath rejection in an FM system is a function of the
receiver's capture ratio, which improves as modulation index
increases.
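To put rough numbers on the modulation index, it can be taken as peak deviation divided by modulating frequency. The sketch below assumes a 75kHz peak deviation, a figure that is not stated in this paper, and applies the same deviation at every frequency purely to show the frequency dependence. In practice the deviation actually used by each component depends on program content.

```python
PEAK_DEVIATION_HZ = 75_000.0  # assumption: not a figure taken from the text


def modulation_index(peak_deviation_hz, modulating_hz):
    """Modulation index: peak deviation divided by modulating frequency."""
    if modulating_hz <= 0:
        raise ValueError("modulating frequency must be positive")
    return peak_deviation_hz / modulating_hz


# Top of the main channel, then the edges and center of the subchannel.
for freq_hz in (15_000, 23_000, 38_000, 53_000):
    index = modulation_index(PEAK_DEVIATION_HZ, freq_hz)
    print(f"{freq_hz / 1000:.0f}kHz: modulation index = {index:.2f}")
```

Under these assumptions the index at the top of the subchannel is well under half of the index at the top of the main channel.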
Multipath
occurs when the FM signal arrives at the receiver via multiple
paths - hence the name. FM transmission is line-of-sight, but
the signal also reflects off buildings and other tall
objects, and this creates the multiple paths and the
subsequent distortions. A signal following a reflected path
arrives slightly later than the direct signal because it has
traveled a greater distance. Here is where some basic physics
comes into play. The reception problems caused by multipath
result from the vector addition of the RF signals arriving
over the different paths, not from the audio signals contained
within the FM-Stereo system. On account of this, the delayed
signal, upon entering the receiver, creates cancellations and
attenuations due to phase/time-delay.
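The cancellation effect can be illustrated with a simple two-path model: one direct signal plus one reflection of reduced amplitude, added as vectors. This is a minimal sketch, not a receiver simulation. The carrier frequency, path difference, and reflection amplitude are hypothetical values chosen only for the demonstration.

```python
import cmath
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def two_path_gain(freq_hz, extra_path_m, reflection_amplitude):
    """Magnitude of the direct signal plus one delayed reflection."""
    delay_s = extra_path_m / SPEED_OF_LIGHT_M_S
    phase = -2.0 * math.pi * freq_hz * delay_s  # phase lag of the reflection
    return abs(1.0 + reflection_amplitude * cmath.exp(1j * phase))


# Hypothetical case: 98.1 MHz carrier, reflection travels 1500 m farther.
carrier_hz = 98.1e6
for offset_hz in (-100e3, -50e3, 0.0, 50e3, 100e3):
    gain = two_path_gain(carrier_hz + offset_hz, 1500.0, 0.8)
    print(f"{offset_hz / 1e3:+6.0f}kHz from carrier: relative level {gain:.2f}")
```

The level changes across the channel because the phase of the reflection depends on frequency, so different parts of the transmitted spectrum are attenuated by different amounts.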
The L-R part
of the mpx signal is extremely sensitive to any form of
disruption created by multipath. When delayed instances of the
L-R signal are present, the stereo decoder in a receiver
becomes confused because it will not know whether to decode
the original or the delayed versions of the signal. The stereo
sound field is destroyed as a result, producing the audible
distortion we’ve all heard.
The L-R
modulation level is critical. Increased L-R level generates
additional mpx sub-channel spectrum in the 23kHz – 53kHz
domain, and this range of frequencies becomes very fragile
during instances of multipath. This is the main reason why
manipulation of the L-R signal for stereo enhancement has been
problem-prone. Most algorithms used for stereo enhancement
generate additional L-R level and this exaggerates the
irritating audible effects of multipath.
This is all
quite real. These are not theoretical speculations, but rather
real-world on-air experiences.
It is worth
pointing out that modulation levels in FM transmission are
governed by audio dynamics processing. Normal processing
yields enough RMS modulation in the L-R signal without
exaggerating multipath. Aggressive processing does have the
potential to pass the threshold into multipath distortion. But
the usual Left/Right processing does not alter the ratio of
L-R to L+R, and an increase in that ratio is the primary cause
of multipath-related problems.
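The ratio of L-R to L+R is easy to measure on any pair of channels. The sketch below computes it from RMS levels and shows that applying the same gain to both channels leaves it unchanged. The function names and sample values are hypothetical, and real processors are of course far more complex than a fixed gain.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def difference_to_sum_db(left, right):
    """Ratio of L-R RMS level to L+R RMS level, in dB."""
    sum_rms = rms([l + r for l, r in zip(left, right)])
    diff_rms = rms([l - r for l, r in zip(left, right)])
    if diff_rms == 0.0:
        return float("-inf")  # pure mono content
    if sum_rms == 0.0:
        return float("inf")   # pure out-of-phase content
    return 20.0 * math.log10(diff_rms / sum_rms)


# Arbitrary sample values, chosen only for the demonstration.
left = [0.5, -0.3, 0.8, -0.1]
right = [0.4, -0.2, 0.6, -0.3]
louder_left = [2.0 * s for s in left]    # same gain on both channels
louder_right = [2.0 * s for s in right]

print(f"original: {difference_to_sum_db(left, right):.2f} dB")
print(f"louder:   {difference_to_sum_db(louder_left, louder_right):.2f} dB")
```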
Matrix Surround Methods and Multipath
Now to our
central topic. Methods that use matrixing for surround are not
new. Most of the quad systems of the 1970s employed some form
of matrix method. The difference today is that digital
implementations offer more flexibility for the encode/decode
process. They are capable of moderate surround performance on
some types of content, but lack consistency with regard to all
content.
A matrix
surround encoder can accept a 5.1 multichannel input and
produce a stereo Left-total (Lt) and Right-total (Rt) output.
The Lt and Rt are a downmix of the surround channels with
embedded position cues. This process is based on phase and
level changes applied to the multichannel audio signals. Each
matrix system applies this technique a bit differently, but
the key element is that each of them alters the phase
relationships between the channels as a means to identify the
individual 5.1 channels within the 2-channel stereo signal.
Due to the
phase modifications, the resulting stereo downmix now contains
altered levels within the L+R and L-R signals. In many cases
the L-R RMS level is significantly increased when compared to
the artistic stereo mix (the independent stereo mix offered by
the artist/producer) of the same content. When
matrix-generated downmixes are broadcast in FM-Stereo, L-R
modulation level is significantly increased. And as we’ve
seen, increased L-R modulation makes for increased multipath!
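A toy example shows why phase-shifted surround content lands in L-R. The sketch below is a generic, hypothetical matrix and is not the algorithm of Neural or of any other commercial encoder. It models each channel as a complex phasor at a single frequency, shifts the surround channels by plus and minus 90 degrees, and omits the LFE channel. The gains and input levels are illustrative assumptions.

```python
import math

CENTER_GAIN = math.sqrt(0.5)    # illustrative -3 dB gain (assumption)
SURROUND_GAIN = math.sqrt(0.5)  # illustrative -3 dB gain (assumption)


def plain_downmix(l, r, c, ls, rs):
    """In-phase fold-down with no embedded position cues."""
    lo = l + CENTER_GAIN * c + SURROUND_GAIN * ls
    ro = r + CENTER_GAIN * c + SURROUND_GAIN * rs
    return lo, ro


def matrix_downmix(l, r, c, ls, rs):
    """Hypothetical matrix encode: surrounds shifted +90 and -90 degrees."""
    lt = l + CENTER_GAIN * c + 1j * SURROUND_GAIN * ls
    rt = r + CENTER_GAIN * c - 1j * SURROUND_GAIN * rs
    return lt, rt


def sum_and_difference_levels(left_total, right_total):
    """Return the magnitudes of (L+R, L-R) for a pair of phasors."""
    return abs(left_total + right_total), abs(left_total - right_total)


# Center-heavy mix with identical ambience in both surround channels.
channels = dict(l=1.0, r=1.0, c=0.5, ls=0.5, rs=0.5)
for name, downmix in (("plain", plain_downmix), ("matrix", matrix_downmix)):
    total, difference = sum_and_difference_levels(*downmix(**channels))
    print(f"{name:6s}  L+R = {total:.3f}   L-R = {difference:.3f}")
```

In this toy case the in-phase fold-down puts no energy in L-R, while the phase-shifted version moves the surround content into L-R and lowers L+R at the same time.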
Proof Positive
The
following X-Y plots, taken of music content, illustrate the
exaggeration of the L-R signal by a matrix
method. The well-known song “Wouldn’t It Be Nice” by The Beach
Boys was used for the demonstration. A recently released
stereo mix from the CD “Pet Sounds” was used for the artistic
stereo version. The DVD-Audio disc of “Pet Sounds” contains a
5.1 Surround mix, which was encoded through a Neural Audio
5225 downmix unit. (This particular unit was provided to me
personally by the CEO of Neural Audio. Presumably, there is no
defect in the unit’s performance.)
These X-Y
plots were gathered using a digital oscilloscope that has
storage capability. The Left channel is connected to the (X)
input and the Right channel to the (Y) input. The scope is set
to measure the two signals in X-Y mode. This yields a pattern
that is commonly used to measure phase differences between two
signals. An in-phase signal shows a straight line at 45
degrees, tilted with its top to the right. Likewise, an
out-of-phase signal yields a straight line at minus 45
degrees, tilted with its top to the left.
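The tilt of the X-Y trace can also be estimated numerically from sampled audio. The following is a minimal sketch, assuming zero-mean signals, that computes the angle of the trace's main axis from the channel powers and their cross-product. The function name and the test tone are my own and are not part of the measurement setup described here.

```python
import math


def xy_axis_angle_deg(x_samples, y_samples):
    """Angle of the main axis of an X-Y trace, in degrees from the X axis."""
    sxx = sum(x * x for x in x_samples)
    syy = sum(y * y for y in y_samples)
    sxy = sum(x * y for x, y in zip(x_samples, y_samples))
    return math.degrees(0.5 * math.atan2(2.0 * sxy, sxx - syy))


# One cycle of a test tone, 32 samples long.
tone = [math.sin(2.0 * math.pi * k / 32) for k in range(32)]
inverted = [-s for s in tone]
silence = [0.0] * len(tone)

print("in phase:    ", round(xy_axis_angle_deg(tone, tone), 1))
print("out of phase:", round(xy_axis_angle_deg(tone, inverted), 1))
print("left only:   ", round(xy_axis_angle_deg(tone, silence), 1))
```

Real program material gives a cloud rather than a line. The closer its main axis leans toward minus 45 degrees, the more L-R content is present.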
The
following illustrates the test setup:

The first
plot is of the artistic stereo mix:

X-Y Plot: “Wouldn’t It Be Nice” (Artistic Stereo)
This signal
appears normal. The content is predominantly in the L+R
domain, with a moderate amount of L-R. Now, here is the same
segment of the song, downmixed via the Neural 5225
Surround encoder:

X-Y Plot: “Wouldn’t It Be Nice” (Neural 5225 Downmixed
Stereo)
Not only is
this significantly different, but the extreme level of L-R
content indicates that this will also sound noticeably quieter
in mono, because the amount of 180-degree out-of-phase
information is very high. This would generate added multipath
distortion, and to make matters worse, mono is compromised so
much that the perceived mono audio level is down by 3dB or
more, a huge loudness loss!
The Beach Boys’ piece is not an isolated case; most music
material we tested showed significant L-R increase when
downmixed with this matrix encoder.
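The link between out-of-phase content and mono level can be worked out from the correlation between the two channels. The sketch below assumes equal-level Left and Right channels with correlation coefficient rho and compares the power of the mono sum against the fully in-phase case. It is a simplified model and not a measurement of the material tested here.

```python
import math


def mono_level_change_db(rho):
    """Mono (L+R) level relative to fully in-phase channels, in dB.

    Assumes equal-level channels with correlation coefficient rho.
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("correlation must lie between -1 and 1")
    if rho == -1.0:
        return float("-inf")  # complete cancellation in mono
    return 10.0 * math.log10((1.0 + rho) / 2.0)


for rho in (1.0, 0.5, 0.0, -0.5):
    print(f"correlation {rho:+.1f}: mono level {mono_level_change_db(rho):+.2f} dB")
```

Under this model a mono loss of about 3dB corresponds to channels that have become essentially uncorrelated, and the loss grows quickly as the correlation turns negative.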
While this
points up a significant problem specifically in the Neural
system, chances are that the other matrix methods will have
similar issues. Probably the Neural algorithm could be
modified to reduce L-R level, but then there would be poorer
surround performance.
Multipath
aside, there is often quite serious degradation of the stereo
for other reasons inherent to the matrixing schemes: 1) The
5.1 mix must be mechanically downmixed to 2 channels.
Producers make 5.1 versions without regard for downmix, and
some music does not survive this process well. 2) Matrixing
requires phase shifts between the front and rear channels as
they are combined to make the “compatible” stereo output. This
dulls aural impact and often just sounds “weird.” All matrix
methods compromise surround and stereo/mono performance in
some fashion. There is no free lunch with the matrix
methodology. Even the folks at Sansui now admit to that!
Conclusion
It is vital
that any proposed method for surround broadcasting not
compromise the transmission performance of stereo or mono.
Likewise, there must not be anomalies that exaggerate
multipath. The matrix systems fail in both of these areas.
Contrast
matrix to the superior performance of the MPEG Spatial System.
Taking full advantage of HD Radio’s capabilities, the
surround spatial information is transmitted in a separate
digital side channel and the original artistic stereo version
is broadcast without modification. There is absolutely zero
aggravation of multipath because the L-R level is unchanged.
Further, MPEG Spatial offers uncompromised surround sound with
full separation.
If you are
to believe the matrix proponents, their method is a simple
solution that can be readily integrated into existing
airchains. But there are real problems. To date, there is not
a single mainstream station using a matrix system for routine
programming. The few matrix broadcasts have been special
demonstration program segments on public stations. The tests
have been with classical or jazz concerts, which are produced
with mostly ambience audio in the surround channels. Pop music
production is very different, with elements dramatically
positioned around the listener – a case much more difficult
for matrix to handle. Let’s see how a matrix system performs
on this kind of music being aired on an aggressively processed
CHR station in New York City or Los Angeles! Until that
happens, take the matrix claims with a big grain of salt.
J. Herre, H. Purnhagen, J. Breebaart, C. Faller, S. Disch,
K. Kjörling, E. Schuijers, J. Hilpert, F. Myburg, "The
Reference Model Architecture for MPEG Spatial Audio Coding,"
118th AES Convention, Barcelona, Spain, May 2005.