With high-resolution formats (those with sample sizes larger than 16 bits, sample rates higher than 48 kHz, or both), the audio industry hopes it's found the secret that'll get consumers buying again. The bigger numbers sound alluring: 24-bit samples, 192-kHz and 2.822-MHz sampling rates. But, as the PC industry has lately discovered, there's an upper-end price threshold beyond which, especially in the absence of compelling applications, customers aren't willing to go.
Anticipated customer acceptance and the price at which you'll achieve that acceptance are important issues to consider when designing your next audio-augmented system. Which of the format contenders have the greatest chance for commercial success and why?

Pacific Microsonics was perhaps the first company to develop a high-resolution audio technology that achieved broad industry adoption. HDCD (High Density Compatible Digital) audio builds on industry-standard audio-CD and -DVD foundations (Figure 1). The developers claim that, by embedding control information within the altered LSBs (least significant bits) of, on average, less than 5% of the formats' stored audio samples, HDCD can dramatically expand the audio's dynamic range if the playback device contains an HDCD-aware decoder. Otherwise, the altered LSBs take the form of uncorrelated noise, analogous to dither, and are effectively inaudible.
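To make the LSB idea concrete, here is a minimal sketch of hiding control bits in the least significant bit of a small fraction of 16-bit samples. It is not the HDCD encoding, whose details the text does not give; the function names, the fixed `stride`, and the one-bit-per-sample layout are all assumptions chosen for illustration.

```python
def embed_bits(samples, bits, stride=25):
    """Overwrite the LSB of every `stride`-th sample with one control bit.

    Hypothetical layout: stride=25 touches 4% of the samples.
    """
    out = list(samples)
    for i, bit in enumerate(bits):
        pos = i * stride
        if pos >= len(out):
            break  # no room left for further control bits
        out[pos] = (out[pos] & ~1) | (bit & 1)
    return out


def extract_bits(samples, count, stride=25):
    """Read back `count` control bits from the same positions."""
    return [samples[i * stride] & 1
            for i in range(count) if i * stride < len(samples)]


if __name__ == "__main__":
    audio = list(range(-100, 100))          # stand-in for 16-bit PCM samples
    control = [1, 0, 1, 1, 0, 0, 1, 0]
    coded = embed_bits(audio, control)
    print(extract_bits(coded, len(control)))
    print(max(abs(a - b) for a, b in zip(audio, coded)))  # error never exceeds 1 LSB
```

A player that ignores the embedded bits sees at most a one-LSB change in the touched samples, which is why the text can compare the effect to low-level dither.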
The substance behind HDCD's "20-bit" marketing sizzle takes three primary forms. First, HDCD encoding dynamically selects, as music characteristics vary over time, among four anti-alias FIR (finite-impulse-response) filters with different characteristics between 16 and 22 kHz. The intent, in balancing frequency rejection and transient response, is to make the filter act as sonically neutral as possible over a wide range of possible audio conditions. Some of the LSB control bits drive a corresponding configurable filter array at the HDCD decoder.
When considering the DVD-Video format as a high-resolution audio-storage platform, some people forget that they don't necessarily need to resort to lossy-compression schemes, such as Dolby Digital or DTS (Digital Theater Systems). It's possible to shoehorn at least two, and as many as six, high-resolution audio channels within DVD-Video's 6.144-Mbps maximum-allowable audio bit rate (Table 1).
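The two-to-six-channel range follows from simple arithmetic: an uncompressed PCM stream needs channels × bits × sample rate bits per second. The sketch below applies that to the 6.144-Mbps ceiling quoted above. It is only a bandwidth check; the function names are illustrative, and it ignores any further restrictions the DVD-Video specification places on legal channel and format combinations.

```python
DVD_VIDEO_MAX_AUDIO_BPS = 6_144_000  # ceiling quoted in the text


def pcm_bit_rate(channels, bits, sample_rate_hz):
    """Raw bit rate of uncompressed linear PCM, in bits per second."""
    return channels * bits * sample_rate_hz


def max_channels(bits, sample_rate_hz, limit_bps=DVD_VIDEO_MAX_AUDIO_BPS):
    """Largest whole number of channels whose raw PCM fits under the limit."""
    return limit_bps // (bits * sample_rate_hz)


if __name__ == "__main__":
    for bits, rate in [(20, 48_000), (24, 48_000), (20, 96_000), (24, 96_000)]:
        n = max_channels(bits, rate)
        used = pcm_bit_rate(n, bits, rate) / 1e6
        print(f"{bits}-bit, {rate // 1000}-kHz: {n} channels ({used:.3f} Mbps)")
```

By this arithmetic, 20-bit, 48-kHz audio fits six channels, and 24-bit, 96-kHz audio fits two.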
Supersonic tonic
The DTS bit-stream format was, from the very beginning, designed for future extensibility, with corresponding backward compatibility to prior-generation decoders (Figure 2). With DVD-Audio, the entire 9.6-Mbps bit stream is available, if desired, for audio information. This higher bit rate enables the media to store two channels of 24-bit, 192-kHz uncompressed audio. The bit rate is not, however, high enough to enable the uncompressed transfer of six channels' worth of 24-bit, 192-kHz or, for that matter, even 96-kHz surround audio. Numerous ways to get around this limitation exist. You could reduce the sampling rate or sample size of all channels. Thanks to the flexibility built into the DVD-Audio specification, you can also selectively reduce the sampling rate or sample size of only some of the channels; they need not all have the same characteristics.
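A quick way to explore these trade-offs is to total the raw bit rate of each group of channels and compare it with the 9.6-Mbps limit. The following sketch does only that; the group layouts in the example are hypothetical, and the code does not model the DVD-Audio specification's rules about which groupings are actually permitted.

```python
DVD_AUDIO_MAX_BPS = 9_600_000  # limit quoted in the text


def stream_bit_rate(groups):
    """Sum raw PCM bit rates over (channels, bits, sample_rate_hz) groups."""
    return sum(ch * bits * rate for ch, bits, rate in groups)


def fits(groups, limit_bps=DVD_AUDIO_MAX_BPS):
    """True if the combined uncompressed stream stays within the limit."""
    return stream_bit_rate(groups) <= limit_bps


if __name__ == "__main__":
    candidates = {
        "stereo 24/192": [(2, 24, 192_000)],
        "six channels 24/96": [(6, 24, 96_000)],
        "six channels 24/48": [(6, 24, 48_000)],
        # Hypothetical mix: three channels at 24/96, three at 16/48.
        "mixed 3 x 24/96 + 3 x 16/48": [(3, 24, 96_000), (3, 16, 48_000)],
    }
    for name, groups in candidates.items():
        mbps = stream_bit_rate(groups) / 1e6
        print(f"{name}: {mbps:.3f} Mbps, fits={fits(groups)}")
```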

The third option involves the use of MLP (Meridian Lossless Packing), developed by Meridian Audio. Unlike most other high-resolution audio-compression algorithms, MLP is lossless. In most cases, MLP in conjunction with buffering will enable six-channel, 24-bit, 96-kHz audio storage (normally requiring a 13.8-Mbps bit rate) to fit within DVD-Audio's peak transfer-rate envelope (Table 2).
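The figures above imply how hard the lossless packer has to work. As a rough sketch, the function below divides the raw PCM rate by the 9.6-Mbps limit to get the minimum average compression ratio. It treats the limit as a simple average, whereas a real lossless coder produces a variable rate that buffering must smooth, so read the result as a lower bound rather than a description of MLP's behavior.

```python
def required_compression_ratio(channels, bits, sample_rate_hz,
                               limit_bps=9_600_000):
    """Raw PCM rate divided by the available rate (values above 1 need packing)."""
    return channels * bits * sample_rate_hz / limit_bps


if __name__ == "__main__":
    ratio = required_compression_ratio(6, 24, 96_000)
    saving = 1 - 1 / ratio
    print(f"ratio {ratio:.2f}:1, i.e. at least {saving:.1%} smaller than raw PCM")
```

For six channels of 24-bit, 96-kHz audio, the packed stream must average at least about 31% smaller than the raw data.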
MLP also provides a means by which you can extend the per-disc playback time (Table 3). Its use on DVD-Audio discs is optional, but its presence on DVD-Audio bit-stream decoders is required. DVD-Audio format standardization is by no means complete; the DVD Forum is, for example, finalizing specifications for DVD-AR (DVD-Audio Recordable). DVD-AR reportedly supports not only linear and packed (MLP-encoded) PCM (pulse-code-modulated) audio, but also six lossy formats: Dolby Digital, DTS, MPEG-1 (and MPEG-2) Layer II, ATRAC-3, MP3PRO (a backward-compatible superset of MP3), and MPEG-2 AAC (Advanced Audio Coding).

Digital watermarking differences are among the least significant of the variations between DVD-Audio and SACD. The brainchild of CD pioneers Philips and Sony, SACD employs a physical, rather than perceptual, watermark that therefore doesn't alter the audio characteristics. Although many first-generation SACDs delivered only two-channel audio, an increasing number of discs contain surround-sound mixes, and many SACDs are multilayer hybrids that also work on CD players, albeit in a 16-bit, 44.1-kHz two-channel fashion (Figure 3).
Turning your attention to streaming media, you should first be aware of the 24-bit MP3 decoders L3Dec and MAD (MPEG Audio Decoder). They claim to reduce the distortion caused by rounding approximations in traditional 16-bit MP3 decoders, and you can either play the results unaltered on a 24-bit-capable sound system or dither them down to 16-bit versions.
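Dithering down to 16 bits can be sketched in a few lines. The example below uses TPDF (triangular-probability-density-function) dither, a common general-purpose choice; it is an assumption for illustration and not necessarily what L3Dec or MAD do internally. The function name is hypothetical.

```python
import random


def dither_24_to_16(sample24, rng=random):
    """Reduce one signed 24-bit sample to 16 bits using TPDF dither."""
    # One 16-bit LSB spans 256 steps of the 24-bit scale. Adding two
    # uniform values of about +/-0.5 LSB each gives triangular noise
    # of about +/-1 LSB peak.
    noise = rng.randint(-128, 127) + rng.randint(-128, 127)
    # Add half an LSB and floor-shift to round to the nearest 16-bit step.
    quantized = (sample24 + noise + 128) >> 8
    # Clamp so dither cannot push full-scale samples out of range.
    return max(-32768, min(32767, quantized))


if __name__ == "__main__":
    rng = random.Random(0)
    source = [0x123400, -0x123400, 0x7FFFFF, -0x800000]
    print([dither_24_to_16(s, rng) for s in source])
```

The added noise decorrelates the rounding error from the signal, trading a slightly higher noise floor for the absence of truncation distortion.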

Microsoft's WMA (Windows Media Audio) Professional is the newcomer in high-resolution audio. The company adapted its base WMA codec to handle as many as eight audio channels along with larger-than-16-bit sample sizes and greater-than-48-kHz sampling rates. WMA has undergone numerous quality-improving revisions during its brief life and is currently at version 9. Two-channel WMA (WMA Consumer) encoders create bit streams compatible with decoders stretching all the way back to version 2. Similarly, Microsoft hopes to "freeze" the WMA Professional bit stream at this initial version, so consumer-electronics customers can embed the necessary decoding hardware and software without fear of future obsolescence.
Conversion segmentation
Audio, like any other human-sensory-input mechanism, is inherently an analog medium, both as it travels toward microphones during capture and as it travels out of transducers during playback. But, with the exception of the enduring cassette tape, all of today's prevalent audio-storage and -distribution media are digital: CDs and DVDs, DAT (digital audio tape) and Minidiscs, along with AAC, MP3, RealAudio, WAV, WMA, and other file formats. Clearly, some conversion is going on, both to (with ADCs) and from (with DACs) the digital domain, as well as within (with SRCs, or sample rate converters) to bring all of the incoming digital data to a common sample rate prior to tackling mixing and other audio-processing functions.
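To illustrate what an SRC has to do, the sketch below reduces a conversion such as 44.1 kHz to 48 kHz to its rational ratio and then resamples by linear interpolation. Linear interpolation is used here only because it is short; it is an assumption for illustration, and hardware SRCs rely on far better filtering than this. The function names are hypothetical.

```python
from fractions import Fraction


def resample_ratio(src_hz, dst_hz):
    """Output-to-input rate ratio in lowest terms."""
    return Fraction(dst_hz, src_hz)


def resample_linear(samples, src_hz, dst_hz):
    """Naive sample-rate conversion by linear interpolation."""
    ratio = resample_ratio(src_hz, dst_hz)
    n_out = int(len(samples) * ratio)
    out = []
    for n in range(n_out):
        pos = Fraction(n) / ratio            # exact position on the input grid
        i = int(pos)
        frac = float(pos - i)
        nxt = samples[min(i + 1, len(samples) - 1)]  # hold the last sample at the edge
        out.append(samples[i] * (1 - frac) + nxt * frac)
    return out


if __name__ == "__main__":
    r = resample_ratio(44_100, 48_000)
    print(r.numerator, r.denominator)        # interpolate by 160, decimate by 147
    print(len(resample_linear(list(range(147)), 44_100, 48_000)))
```

The awkward 160:147 ratio is one reason converting between the 44.1-kHz and 48-kHz families costs more than a simple doubling or halving of the sample rate.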
Peruse the myriad ADC, DAC, ADC-plus-DAC (codec), and SRC options available from companies such as AKM Semiconductor, Analog Devices, Cirrus Logic, Texas Instruments, and Wolfson Microelectronics, and you may walk away with a severe headache. A tremendous diversity of alternatives exists; one obvious differentiator is the number of integrated channels. You'll also discover that a few decibels' difference in claimed dynamic range or THD (total harmonic distortion) plus noise, both inside and outside the audible frequency range, can significantly affect the price you'll pay (Figure 4). First, though, you'll need to ask yourself if you even believe the vendors' specs.
An ADC might integrate one or multiple S/PDIF (Sony/Philips Digital Interface) transmitters. A DAC might include S/PDIF receivers, global or per-channel digital volume control, or the capability to directly interact with both multibit PCM and single-bit (SACD) inputs. A direct path from the SACD decoder to the DAC saves you from the added expense, the additional board space, and the audiophile purists' wrath that a separate SACD-to-PCM transcode chip, such as Nippon Precision Circuits' SM5816AF, would create. Keep in mind when evaluating your options that for DVD-Audio, only two channels have to support 192-kHz sampling rates.

For practical purposes, ask yourself just how wide the dynamic range and how low the THD plus noise really need to be, given that the codec is just one piece of an audio-processing chain that's constrained by its weakest link. How high is the quality of the speakers that consumers will likely hook up, directly or indirectly, to this piece of equipment? How much degradation will occur through equipment interconnect? What kind of music will the average user listen to? And, perhaps most importantly, what are the characteristics of the anticipated listening environment?
Some examples from recent Cirrus Logic press releases illustrate the many trade-offs you face. In October 2001, Cirrus quoted the CS4362 six-channel DAC at US$5.35 (10,000), and priced the CS4382 eight-channel DAC at US$6.50. In May 2002, Cirrus priced the CS5361 ADC, with differential inputs, 114-dB dynamic range, and 105-dB THD plus noise, at US$4.95 (10,000). The pin-compatible CS5351, with single-ended inputs, 108-dB dynamic range, and 100-dB THD plus noise, was US$3.95 (Table 4).
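One way to put these dynamic-range figures in perspective is to convert them to an equivalent number of bits. The sketch below uses the textbook relationship for an ideal quantizer driven by a full-scale sine wave, SNR ≈ 6.02N + 1.76 dB. Applying that formula to a data-sheet dynamic-range figure is an approximation, so treat the results as rough guides rather than vendor specifications.

```python
def ideal_dynamic_range_db(bits):
    """Theoretical SNR of an ideal N-bit quantizer with a full-scale sine input."""
    return 6.02 * bits + 1.76


def effective_bits(dynamic_range_db):
    """Invert the same formula to get an equivalent resolution in bits."""
    return (dynamic_range_db - 1.76) / 6.02


if __name__ == "__main__":
    for bits in (16, 20, 24):
        print(f"{bits}-bit ideal: {ideal_dynamic_range_db(bits):.1f} dB")
    for part, dr in (("CS5361", 114), ("CS5351", 108)):
        print(f"{part}: {dr} dB is roughly {effective_bits(dr):.1f} bits")
```

By this measure, a 114-dB converter resolves between 18 and 19 bits, well short of what a 24-bit sample container can nominally hold.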
Larger chunks of audio data, flowing into and out of the system at faster rates, make correspondingly greater demands on the processing subsystem. Estimates of the number-crunching needed to decode DTS 96/24, for example, start at 25 MIPS (according to DTS, on the Analog Devices 21065L 32-bit floating-point SHARC DSP) and can rise far above that figure, reflecting DSP-architecture variations (such as the 32-bit integer processing in Analog Devices' Melody 32 DSPs), the use of high-level, inefficient languages to code the algorithms, and other factors. Clock rate is not the only determinant of a DSP's performance; the amount of embedded memory, for example, is also key. Every time the DSP exceeds the capacities of internal RAM and ROM and must access much slower external memory, sustained performance will suffer.

The emergence of 24-bit audio has fueled the long-running debate over 24- versus 32-bit processors. This trend reaches its peak with Texas Instruments' TMS320DA610 audio DSP, the first member in the company's Aureus line. TI claims that the DA610, which runs at 225 MHz, delivers a mind-boggling 1800 MIPS and 1350 Mflops of performance (under certain conditions).
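The quoted peak figures are easier to judge once you divide them by the clock rate. This small sketch uses only the numbers TI supplies above and shows what they imply per clock cycle, along with how many cycles are available in each sample period at a given sampling rate. The function names are illustrative.

```python
CLOCK_HZ = 225_000_000   # from the text
PEAK_MIPS = 1800         # from the text
PEAK_MFLOPS = 1350       # from the text


def per_cycle(peak_millions_per_second, clock_hz):
    """Operations per clock cycle implied by a peak rating."""
    return peak_millions_per_second * 1_000_000 / clock_hz


def cycles_per_sample_period(clock_hz, sample_rate_hz):
    """Clock cycles available between consecutive samples."""
    return clock_hz / sample_rate_hz


if __name__ == "__main__":
    print(per_cycle(PEAK_MIPS, CLOCK_HZ))      # instructions per cycle
    print(per_cycle(PEAK_MFLOPS, CLOCK_HZ))    # floating-point operations per cycle
    print(cycles_per_sample_period(CLOCK_HZ, 96_000))
```

The peak ratings assume eight instructions, six of them floating-point, issue on every cycle, which is the "under certain conditions" caveat in practice.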
Regardless of the vendors' claims, you need to make sure there's always enough overhead to handle not only the audio-decoding functions but also various postprocessing tasks, including bass management and other types of speaker compensation, THX adjustments, and surround-sound speaker virtualization. If you're using a traditional single-core DSP architecture, such as Analog Devices' SHARC line, Motorola's DSP5636x DSPs, or TI's DA610, you may need to incorporate a second DSP in your design and allocate functions between the two processors.
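A simple budget check captures the overhead question. In the sketch below, only the 25-MIPS DTS 96/24 decode estimate comes from the text; the postprocessing loads, the processor capacity, and the 20% reserve are hypothetical placeholders that you would replace with measured figures for your own DSP, since MIPS ratings are not directly comparable across architectures.

```python
def headroom_mips(capacity_mips, task_loads, reserve_fraction=0.2):
    """MIPS left after all tasks, keeping a safety reserve of the total capacity."""
    usable = capacity_mips * (1 - reserve_fraction)
    return usable - sum(task_loads.values())


if __name__ == "__main__":
    tasks = {
        "dts_96_24_decode": 25.0,        # lower-bound estimate from the text
        "bass_management": 5.0,          # hypothetical
        "speaker_virtualization": 15.0,  # hypothetical
    }
    capacity = 66.0                      # hypothetical single-core DSP rating
    spare = headroom_mips(capacity, tasks)
    verdict = "fits on one DSP" if spare >= 0 else "consider a second DSP"
    print(f"{spare:.1f} MIPS spare: {verdict}")
```

A negative result is the signal to move some of the postprocessing tasks to a companion processor.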
Alternatively, Cirrus Logic's CS49400 is, all by itself, a dual-core DSP. The CS49400 walks a middle path in the 24- versus 32-bit debate. A 24-bit processor handles decoding, a separate 32-bit DSP handles post-decode functions, and the partitioned DTS 96/24 algorithm splits between the two processors. Motorola recommends the DSP56311 or DSP56321, with their EFCOP (Enhanced Filter Coprocessor) cores, as companion chips for its main DSP5636x audio DSPs.
High-resolution audio support is increasingly appearing in mainstream PCs, not just in high-end machines for professional use. Until the AC'97 specification and silicon undergo another revision, 20-bit, 48-kHz, six-channel audio and 20-bit, 96-kHz, two-channel audio define the upper-end limit for codecs, such as those by SigmaTel. Higher-bandwidth PCI-, USB-, and IEEE-1394-based boards and external peripherals fulfill today's ultrasonic audio needs. Creative Labs' THX-certified Audigy 2 line, based on the company's Emu processors, handles the generation of 6.1-channel audio (DTS-ES, Dolby Digital EX, and others); the decoding of DVD-Audio's full range of sample sizes and rates; and recording at 24-bit, 96-kHz quality.
Via Technologies' acquisition of IC Ensemble in late 2000 gave it 24-bit, 96-kHz (with the Envy24 processor) and 24-bit, 192-kHz (with Envy24HT) capability, of which numerous third-party sound-card manufacturers are taking advantage.
Acknowledgments
Thanks to Analog Devices, Cirrus Logic, and Via Technologies for their useful research materials. Thanks, too, to the companies who provided hardware and software for my hands-on work. I'd also like to acknowledge the members of the AES and participants in the various rec.audio newsgroups for their thought-provoking insights and opinions.
AT A GLANCE
• Advanced audio formats' success won't necessarily come from their high-resolution features.
• Format-backward compatibility enables consumers to gradually upgrade their gear.
• Conversions to, from, and within the digital domain are a key piece of the audio puzzle.
• Processing options balance performance with price to deliver a design-specific optimum approach.
• Digital interconnect is user-friendly and high-quality, and it may even deliver the lowest overall cost.
You can contact Technical Editor Brian Dipert at (1) 916-454-5242, Fax (1) 916-454-5101, E-mail bdipert@edn.com