
Using bit-error-rate testers to test-drive FEC codes

( 01 Dec 2003 )
by Tom Waschura, SyntheSys Research

FEC (forward error correction), a critical component of many modern digital-communications applications, turns otherwise-unusable links into real and practical systems. From DVDs to cell phones, satellite TV to disk drives, error-correction technology is a mathematical marvel that effectively makes a silk purse from a sow's ear. Figure 1 shows a simplified diagram of a communications channel with FEC encoding and decoding.


Approaches that correct for bit errors during digital communication vary from simple error-detection mechanisms, to non-real-time correction capability, to real-time, on-the-fly error correction. Choices among approaches depend on the system requirements and statistics of the anticipated errors. Faced with the need to correct occasional random single-bit errors, you might select an error-correction strategy unsuited to rare, short, multiple-bit error bursts. Single long-error-burst events might indicate a different approach that requires large amounts of buffering and could introduce unacceptable latencies. The trade-offs that you must make in specifying efficient, effective error correction require you to know or anticipate system- and application-performance requirements.

Before designing an error-correction process, you must fully understand the types of errors that are typical in the system. The best way to get this information is by collecting error statistics during various typical-use cases. In the past, error statistics were limited to simple bit-error-rate averages, which offer little insight into the design of error-correction strategies. Bit-error-rate testers that capture the exact bit location of detected errors provide the precise statistics you need to make these choices. Examples of statistics that help in this determination include:

  • separate measurements of bit- and burst-error rates;

  • probability distributions of different burst lengths;

  • populations of the number of data blocks that contain different numbers of errors; and

  • the distribution of error-free intervals between errors.


Used together with the system requirements, these measurements provide the data necessary to make an intelligent design choice.
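
As a rough illustration of how the statistics listed above might be derived from a list of exact bit-error locations, consider the following sketch. The function name, the `burst_gap` rule for deciding when neighboring errors belong to one burst, and the default block size are all assumptions made for this example; they do not describe any particular tester's definitions.

```python
from collections import Counter

def error_statistics(error_bits, block_bits=1632, burst_gap=8):
    """Summarize bit-error positions (hypothetical helper).

    Assumption: errors separated by fewer than burst_gap error-free bits
    belong to the same burst. block_bits is the size of a data block.
    """
    positions = sorted(set(error_bits))
    bursts = []  # each entry: [first_bit, last_bit, error_count]
    for pos in positions:
        if bursts and pos - bursts[-1][1] - 1 < burst_gap:
            bursts[-1][1] = pos
            bursts[-1][2] += 1
        else:
            bursts.append([pos, pos, 1])
    return {
        # separate counts of isolated bit errors and burst events
        "isolated_bit_errors": sum(1 for b in bursts if b[2] == 1),
        "burst_events": sum(1 for b in bursts if b[2] > 1),
        # histogram of burst lengths, measured first error to last error
        "burst_lengths": Counter(b[1] - b[0] + 1 for b in bursts if b[2] > 1),
        # how many blocks contain 1, 2, 3, ... errors
        "errors_per_block": Counter(
            Counter(p // block_bits for p in positions).values()
        ),
        # histogram of error-free run lengths between consecutive errors
        "error_free_intervals": Counter(
            b - a - 1 for a, b in zip(positions, positions[1:])
        ),
    }

stats = error_statistics([10, 12, 13, 500, 2000, 2001])
print(stats["burst_lengths"])
```

Dividing the isolated-error and burst-event counts by the number of bits observed would give separate bit- and burst-error rates.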

For example, Hamming codes, popular in memory arrays, are well-suited for high-probability, random, 1-bit errors in short code words. Maximum-likelihood codes, of which Viterbi trellis detectors are a subset, reduce 1-bit errors caused by white noise. Fire codes, which tape-drive and floppy-disk systems use, offer efficient and fast correction of rare, single-burst errors that have a length of less than 7 to 15 bits. Product-array RS (Reed-Solomon) codes, which everything from CD-ROMs to deep-space communications use, provide effective correction for potentially long error bursts at the expense of large buffers and processing latency.

Add data; then, take it away
The math behind error-correction codes is based on the concept of adding some information to the transmitted message so that messages received with error become less likely than messages received without error. Generally, you can think of a message with added FEC information as a code word. Sometimes, the FEC information is just appended to the end of the message (for example, CRC, parity, and checksums). In other cases, the FEC information is convolved with the data to create an entirely new message (for example, Viterbi and 8b/10b, or 8-bit/10-bit, codes).

Because the chosen strategy determines the FEC decoder's complexity, a wrong decision on error-correction type can add considerable system complexity and significantly increase the system-design effort. The complexity defines the inherent latency, processing requirements, misdetection and correction probabilities, and error-propagation modes. For example, floppy-disk drives can use firmware and only simple hardware-CRC error detectors to correct single-sector, small-burst errors. When the detector catches a CRC error, reading slows, and the software takes over to attempt a small correction using the result of the CRC computation. This approach can work effectively because the errors are rare and the system has no real-time requirement. On the other hand, a digital-video-tape player cannot pause playback to fix an error. In this case, the playback unit must correct the error in real time. The choice of error-correction strategy must reflect the real-life error statistics.

Identifying and recording the exact bit locations of a channel's detected errors allows bit-error-rate testers to easily emulate proposed error-correction strategies. The simplest case is an RS-type block code. RS block codes form the basis of some of the most popular FEC systems, including satellite broadcast, underwater fiber optics, digital tape recording, and deep-space communications. These codes append 2T symbols of overhead to a message of length k symbols to make a code word of total length n=k+2T symbols. This code, sometimes referred to as an RS(n,k) code, can correct as many as T erroneous symbols, regardless of their location in the code word.
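
The relationship between n, k, and T can be checked with a few lines of arithmetic. In this sketch the helper name is hypothetical, and overhead is expressed relative to the message length k, which is an assumption; some references quote it relative to n instead.

```python
def rs_parameters(n, k):
    """Return correction strength and overhead of an RS(n, k) code."""
    if n <= k or (n - k) % 2:
        raise ValueError("n - k must be a positive even number (2T)")
    t = (n - k) // 2
    return {"n": n, "k": k, "T": t, "overhead": (n - k) / k}

for n, k in [(204, 188), (204, 196)]:
    p = rs_parameters(n, k)
    print(f"RS({n},{k}): T={p['T']}, overhead={p['overhead']:.1%}")
```

For RS(204,188) this gives T=8, and for RS(204,196) it gives T=4.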

For example, MPEG-2 (Moving Picture Experts Group 2) data used for DVB (digital-video-broadcasting) satellite broadcasting uses a 30- to 90-Mbps RS(204,188) code, allowing for correction of as many as eight byte symbols. The decoder processes in real time every block of 204 bytes it receives. As long as a block contains no more than eight byte errors, the decoder removes all errors and provides perfect video. If more than eight byte errors exist, the decoder can't remove the errors, and image problems occur.

Sorting and counting errors
To see how many errors land in individual code words, analysis functions integrated into bit-error-rate testers can sort and count detected errors' exact bit locations according to user-defined error-correction parameters. For example, in DVB MPEG-2 data, errors can accumulate on 204-byte boundaries. Whenever the number of symbol errors within the 204-byte block is less than or equal to eight, you can remove the errors from further error counting and analysis, because an error corrector would have removed them. This type of analysis computes the corrected error rate by counting errors only when a 204-byte packet contains more than eight byte errors (Table 1).

Symbol size is the first parameter that you must define before you use a bit-error-rate tester to perform this type of analysis. Symbol sizes of 8 to 10 bits are commonplace. The rest of the analysis ignores individual bit errors in favor of symbol errors. The bit-error-rate tester considers a symbol to be in error if it contains one or more bit errors. It can readily compute symbol-error statistics when it knows the exact locations of all bit errors in the data stream.
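
The sorting and counting described above can be sketched as follows. This is a simplified emulation under stated assumptions: code words are aligned to bit 0 of the captured stream, the corrector is idealized (it always succeeds with T or fewer symbol errors and otherwise leaves the data untouched), and the function name and defaults are hypothetical.

```python
from collections import defaultdict

def corrected_error_bits(error_bits, n=204, t=8, symbol_bits=8):
    """Return the bit errors that survive an idealized RS corrector.

    A symbol counts as errored if it contains one or more bit errors.
    Code words with t or fewer errored symbols are treated as corrected.
    """
    codeword_bits = n * symbol_bits
    by_codeword = defaultdict(list)
    for pos in error_bits:
        by_codeword[pos // codeword_bits].append(pos)
    survivors = []
    for bits in by_codeword.values():
        bad_symbols = {pos // symbol_bits for pos in bits}
        if len(bad_symbols) > t:
            survivors.extend(bits)
    return sorted(survivors)

# Code word 0: nine bit errors in eight symbols (correctable).
# Code word 1: nine errored symbols (not correctable).
errors = [0, 1] + [8 * i for i in range(1, 8)] + [1632 + 8 * i for i in range(9)]
remaining = corrected_error_bits(errors)
total_bits = 2 * 204 * 8
print(len(errors) / total_bits, len(remaining) / total_bits)
```

The two printed values are the raw and the emulated post-correction bit-error rates for this made-up capture.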

Individual code words in RS block codes can correct only a relatively small number of symbol errors. Increasing this number causes a large increase in the amount of code overhead as well as a large increase in the processing power and time necessary to make real-time corrections. When errors tend to occur in bursts, an alternative to increasing the RS block code's T value exists: interleaving the data in a memory buffer, which improves the burst-correction capability at the expense of more latency.

Interleaving attempts to split apart burst errors so that their individual symbol errors fall into multiple code words. A 14-symbol burst error confronting a single RS(204,188) code causes the code to fail. However, by splitting apart every other byte and handing it to two separate RS(204,188) codes, the same T=8 correction logic can correct the entire error. The penalty is that the receiver must wait to get two complete 204-byte code words before it can start working on the correction. This latency is unnoticeable in some systems (for example, streaming applications, such as digital-video playback and deep-space satellite receivers). However, in other transactional systems (for example, networking packets), this latency severely limits the code's usability.
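
The 14-symbol example can be verified with a short count. This sketch assumes a symbol-by-symbol interleave in which consecutive symbols go to successive code words, and a burst that stays within one interleaved frame; the function name is hypothetical.

```python
from collections import Counter

def burst_symbols_per_codeword(burst_start, burst_len, depth):
    """Count how many symbols of a contiguous burst land in each code word."""
    return Counter((burst_start + i) % depth for i in range(burst_len))

T = 8
for depth in (1, 2):
    counts = burst_symbols_per_codeword(100, 14, depth)
    print(depth, dict(counts), max(counts.values()) <= T)
```

Without interleaving, one code word sees all 14 errored symbols; with a depth of two, each code word sees seven, which is within the T=8 limit.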

Interleaving and sorting
Bit-error analysis can easily emulate interleaving, which is a simple sorting function. Typically, you form interleaves by specifying the number of simultaneously filled code words. For example, an interleave of four RS(204,188) 8-bit symbol codes comprises a table of four rows with 204 bytes in each row (Figure 2). This table represents 6528 bits. When a bit error occurs, the location information determines where in the table the error would have occurred. Once all 6528 data bits are present, a row-by-row examination of the table determines whether any row contains more than eight symbol errors.


The removal, before counting, of errors in all rows that have eight or fewer erroneous symbols virtually implements the error correction that would occur. Thus, the rate of the remaining errors reveals the postcorrection error performance.
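
A minimal sketch of this table-based emulation follows. The fill order is an assumption (consecutive symbols are written to successive rows, so symbol s of a frame lands in row s mod depth), as are the function name and the idealized corrector; a real tester lets you choose how the table is filled and drained.

```python
from collections import defaultdict

def interleaved_survivors(error_bits, depth=4, n=204, t=8, symbol_bits=8):
    """Emulate a depth-row interleave of RS code words with strength t."""
    frame_symbols = depth * n
    table = defaultdict(list)  # (frame, row) -> [(symbol, bit position)]
    for pos in error_bits:
        symbol = pos // symbol_bits
        frame, offset = divmod(symbol, frame_symbols)
        table[(frame, offset % depth)].append((symbol, pos))
    survivors = []
    for entries in table.values():
        # Keep the errors only in rows with more than t errored symbols.
        if len({symbol for symbol, _ in entries}) > t:
            survivors.extend(pos for _, pos in entries)
    return sorted(survivors)

burst_32 = [8 * s for s in range(32)]  # one bit error in each of 32 symbols
burst_33 = [8 * s for s in range(33)]
print(len(interleaved_survivors(burst_32)), len(interleaved_survivors(burst_33)))
```

With four rows and T=8, the 32-symbol burst is fully removed, while a 33-symbol burst overloads one row and leaves that row's errors in the count.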

Other examples of this 2-D interleave with a 1-D correcting code are the ITU (International Telecommunication Union) standard G.709 and G.975 codes for fiber-optic communications. For example, G.709 calls out 8-bit symbols with a T=8, RS(255,239) code interleaved across 16 rows. G.975 calls out a similar code with only four rows of interleave.

You can also use multiple dimensions of block codes to enable a relatively simple RS block code, such as one with a small T value, to correct large burst errors. However, because it requires two levels of correction and the full table must be present before corrections can start, this approach further increases the latency between data reception and decoded-data output. Digital-video recorders use this technique because latency is not an issue and large bursts are common. Once a table is full of code words, this architecture first corrects each individual row and then corrects individual columns. As long as no more than T rows fail, where T is the strength of the column code, the column corrector corrects all errors in these rows. This technique offers a good trade-off for correcting both random and burst-error components.
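
The row-then-column procedure can be emulated on a set of errored table cells. This sketch is a single-pass, idealized model: the function name is hypothetical, each (row, column) pair marks one errored symbol, and a row or column is assumed to be fully corrected whenever it holds no more errored symbols than its code's strength.

```python
from collections import Counter

def product_code_survivors(error_cells, t_row, t_col):
    """Emulate one row pass followed by one column pass."""
    cells = set(error_cells)
    row_counts = Counter(r for r, _ in cells)
    # Row pass: rows within the row code's strength are cleaned.
    cells = {(r, c) for r, c in cells if row_counts[r] > t_row}
    col_counts = Counter(c for _, c in cells)
    # Column pass: columns within the column code's strength are cleaned.
    return {(r, c) for r, c in cells if col_counts[c] > t_col}

# A burst wipes out 20 symbols in each of rows 0 and 1; row 5 has one error.
burst = [(r, c) for r in (0, 1) for c in range(20)]
print(product_code_survivors(burst + [(5, 3)], t_row=2, t_col=2))
```

Here the row pass removes the isolated error, both burst rows fail, and the column pass then sees only two errored symbols per column, so nothing survives. A third failed row would exceed the column strength of two.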

Cases in which random errors pose no problem can employ another technique if they require the ultimate in correction of long error bursts. Because it uses one symbol to find an error and another symbol to correct the error, RS coding must append 2T symbols to a message to accomplish only T corrections. With known error locations, however, the code can use all symbols for error correction and can thus achieve twice the correction efficiency. For example, when using a 2-D product-array code, the inner-code decoder could find the rows that contain errors. As long as the number of such rows is no more than 2T, the decoder can mark these rows as containing errors and allow the outer-code decoder to blindly correct each row. Depending on the method of filling the interleave table, this mechanism can double the correctable-error-burst length. Communications engineers often refer to this technique as using an inner-code failure to erase the outer code.
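
The doubling follows from the standard decoding bound for an RS code with 2T check symbols: a code word is correctable when twice the number of errors at unknown locations plus the number of erasures (errors at known locations) does not exceed 2T. A minimal check, with a hypothetical function name:

```python
def rs_can_correct(errors, erasures, t):
    """True if an RS code with 2*t check symbols can handle this mix.

    errors: errored symbols whose locations are unknown.
    erasures: errored symbols whose locations are already marked.
    """
    return 2 * errors + erasures <= 2 * t

T = 8
print(rs_can_correct(8, 0, T))   # eight unlocated errors
print(rs_can_correct(9, 0, T))   # nine unlocated errors
print(rs_can_correct(0, 16, T))  # sixteen erasures
```

With T=8, the code handles eight errors at unknown locations but as many as sixteen erasures.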

A bit-error-rate tester can easily analyze all of these block-code-based architectures. Interleave-table dimensions and filling/draining algorithms can adapt to any of these approaches. Because overall corrector effectiveness can drop dramatically if channels suffer from phenomena that the model does not include, emulating FEC algorithms with a bit-error-rate tester offers the advantage of using real error data from a digital channel for analysis instead of relying on a hypothetical model for error statistics.

Error-location analysis
An example should clarify the role of a bit-error-rate tester in optimizing FEC coding. The example begins with an uncorrected data channel that has a general average background error rate of 2.68×10⁻⁶, in which the errors come from both burst and nonburst components. Error distributions, which you obtain by applying various error-location-analysis techniques, reveal the bursts to be random and correlated. Figure 3 presents an error map of a portion of this digital channel. The error map breaks the data into segments and places the segments next to each other to build a 2-D image of the bit-error information. The view highlights the locations of detected bit errors. Separately, differently coloring errors from bit and burst phenomena gives better insight into the cause. Figure 4 shows that both bit and burst errors are present and that some errors are highly correlated with the segment length (the horizontal "band"), which equals the system's "natural" data-packet size.
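
An error map of this kind is straightforward to build from bit-error positions. The sketch below uses a text grid with one row per segment; the function names, the orientation, and the tiny segment length are illustrative assumptions and do not reproduce the layout of the figures.

```python
def error_map(error_bits, total_bits, segment_bits):
    """Return a 2-D grid: one row per segment, 1 where a bit error occurred."""
    n_rows = -(-total_bits // segment_bits)  # ceiling division
    grid = [[0] * segment_bits for _ in range(n_rows)]
    for pos in error_bits:
        grid[pos // segment_bits][pos % segment_bits] = 1
    return grid

def render(grid):
    """Draw the grid with '#' for errors and '.' for error-free bits."""
    return "\n".join("".join("#" if c else "." for c in row) for row in grid)

# Errors that repeat at the segment period line up in the map.
print(render(error_map([1, 5, 9], total_bits=12, segment_bits=4)))
```

When the segment length matches a periodicity in the errors, such as a packet size, the errors align in the map, which is the kind of correlation the error map is meant to expose.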


Before going much further, it is appropriate to address another point about burst errors. A probability distribution of burst lengths alone is inadequate to determine an FEC code's required correction strength. Often, you find burst errors in the presence of other background errors. Additionally, burst errors can highly correlate with themselves, so that one burst predicts the future presence of another. In these cases, a single FEC code word may encounter more than just a single burst error. Block-error statistics, error-free-interval probabilities, and error autocorrelations can provide better insight into the rest of the error story. Ultimately, though, actual FEC emulation based on bit-error position is the most precise way to study FEC effectiveness before building hardware.


To design an error corrector for this data, you first use a simple 1-D block code with 8-bit symbols, RS(204,196), which is a T=4 corrector. This code removes symbol errors when a block of 204 bytes contains no more than four byte errors. To perform this analysis inside a BERT equipped with FEC emulation, you must enable this type of error corrector and set up the parameters in Table 2.

This analysis shows that the simple FEC improves the error rate to 8.55×10⁻⁷. To implement this code, you must add a modest latency to the data availability to allow time to buffer the 204 data bytes and process them for correction, which typically requires a second pass through the data. Additionally, an added overhead of 4% is required to achieve this improvement. Figure 5 shows the error map of the post-corrected data. As anticipated, the small errors are virtually removed, and the larger burst errors persist.

To improve burst-error correction, you can add a simple five-row interleave. Adding this interleave increases the buffering requirement and latency of data availability, so minimize this interleave depth. To add the interleave, change the FEC parameters to those shown in Table 3.

With this code, you would expect to correct single isolated bursts as long as 20 bytes. The T=0 setting in the second code indicates that there is no correction in the outer code for this analysis. These parameters interleave the data in 2-D but correct in 1-D. Using these settings, the corrected error rate drops to 2.64×10⁻⁸. Figure 5 shows the error map for this interleaved error correction. Note that this enhancement still uses the same basic T=4 decoder technology but precedes it with an interleave to more evenly distribute errors among a small group of FEC code words.
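
The 20-byte figure follows from the interleave depth multiplied by T. The sketch below also estimates the buffer that must fill before correction can start; it assumes a symbol-aligned burst, a symbol-by-symbol interleave, and hypothetical helper names.

```python
def max_correctable_burst(depth, t):
    """Longest symbol-aligned burst that a depth-row interleave can absorb."""
    return depth * t

def interleave_buffer_symbols(depth, n):
    """Symbols that must be received before the table can be corrected."""
    return depth * n

print(max_correctable_burst(depth=5, t=4))        # burst length in bytes
print(interleave_buffer_symbols(depth=5, n=204))  # buffer size in bytes
```

For the five-row RS(204,196) example, this gives a 20-byte burst capability at the cost of buffering 1020 bytes, compared with 204 bytes for the non-interleaved code.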

You could continue this approach using deeper interleaves and stronger correction strengths to achieve significant error-rate improvements at the cost of complex correction decoders and long latencies. Using error-location statistics in bit-error-rate testers, you can easily explore error-correction strategies to make smart choices about the processing requirements, overheads, and latencies you need to create digital channels for tomorrow's innovative products.

Author Information
Tom Waschura holds degrees from Stanford University, MIT, and Hiram College. He has developed inventions that are incorporated in products from SyntheSys Research (www.synthesysresearch.com) and Agilent (www.agilent.com).

Box story: a note to purists
Purist mathematicians won't appreciate several simplifications that this article and underlying analysis use. For example, the analysis assumes that the Reed-Solomon correction ability is based only on the location and quantity of errors within a code word. In truth, a decoder can sometimes incorrectly detect or correct an error. Fortunately, such occurrences are orders of magnitude less likely than the usual cases.

 
© 2012 EDN Asia All rights reserved.