A design guide for implementing audio signal detection

Article By : Viacheslav Kolsun

This guide covers the principles and a device implementation for detecting audio signals and distinguishing noise and no-signal conditions from true audio signals.

Sound can be represented with either analog or digital audio signals. Analog audio signals use electrical voltage levels. Different types of transducers convert sound to electrical signals and electrical signals to sound. The audio signal frequency range is roughly 20 Hz to 20,000 Hz.

Microphones produce audio signals and loudspeakers receive them, but a signal in the audio frequency range may also be white noise or single-tone noise caused by issues in electrical circuits. There may also be no signal at all. An audio detector must account for these possibilities in order to distinguish noise and no-signal conditions from true audio signals such as human speech, music, and natural sound.

Principles of audio signal detection

The human ear can hear frequencies in the approximate range of 20 Hz to 20,000 Hz. This range can include single tones such as transformer hum, as well as white noise from radio systems. Such sounds are undesirable in audio systems, and at high levels they can damage hearing. Human speech, music, and natural sounds, by contrast, contain frequencies that vary continuously. The audio detector should therefore register frequency variations and use them to pick out useful audio signals.

Figure 1 This is how audio signal detection works. Source: Dialog Semiconductor

The basic theory behind audio signal detection is shown in Figure 1. The system design uses three reference frequencies: 100 Hz, 500 Hz, and 3 kHz. For a given signal, the system counts the number of times the frequency of the signal crosses each reference frequency in a certain period of time. Only crossings from low to high frequency are counted; for instance, a change from 50 Hz to 150 Hz counts as a crossing of 100 Hz, while a change from 150 Hz to 50 Hz does not. The design treats the signal as audio if it crosses any two of the three reference frequencies at least the minimum number of times specified in Table 1.

Table 1 Minimum frequency crossings to detect audio signals; these numbers can be adjusted according to user needs through I2C. Source: Dialog Semiconductor

There are three sample signals shown in Figure 1:

  1. The noise which crosses 3 kHz three times (shown in black).
  2. The single tone hum which doesn’t cross any frequencies (shown in red).
  3. The signal which varies like speech or music (shown in green). It crosses 100 Hz six times, 500 Hz five times, and 3 kHz once. Although the curve crosses all three reference frequencies, the device does not register 3 kHz, because a single crossing is below the minimum of two given in Table 1. It does register 500 Hz (five crossings; the minimum is two) and 100 Hz (six crossings; the minimum is four). Since two reference frequencies are crossed a sufficient number of times, the signal is detected as audio.
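The counting rule described above can be sketched in a few lines of Python. This is only an illustrative software model, not the GreenPAK implementation: the function names are hypothetical, the input is assumed to be a list of already-measured frequencies, and the minimum crossing counts are the ones quoted in the text for Table 1 (four for 100 Hz, two for 500 Hz, two for 3 kHz).

```python
from typing import Dict, Sequence

# Minimum low-to-high crossings per reference frequency, as quoted in the
# text for Table 1. These values are adjustable in the real device.
MIN_CROSSINGS: Dict[int, int] = {100: 4, 500: 2, 3000: 2}
REQUIRED_REFERENCES = 2  # two reference frequencies must qualify


def count_upward_crossings(freqs_hz: Sequence[float], ref_hz: float) -> int:
    """Count transitions from below ref_hz to at or above ref_hz."""
    return sum(
        1
        for prev, cur in zip(freqs_hz, freqs_hz[1:])
        if prev < ref_hz <= cur
    )


def is_audio(freqs_hz: Sequence[float]) -> bool:
    """Return True if enough reference frequencies are crossed often enough."""
    qualified = sum(
        1
        for ref_hz, minimum in MIN_CROSSINGS.items()
        if count_upward_crossings(freqs_hz, ref_hz) >= minimum
    )
    return qualified >= REQUIRED_REFERENCES


# Made-up frequency sequences standing in for the three curves in Figure 1.
hum = [50.0] * 10                                # single tone, no crossings
noise = [2500.0, 3500.0] * 3                     # crosses only 3 kHz
speech_like = [50.0, 150.0, 50.0, 150.0, 50.0, 150.0, 50.0, 600.0, 50.0, 600.0]
```

With these made-up sequences, only `speech_like` satisfies two reference frequencies, so it is the only one that `is_audio` accepts.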

Note that speech or music can contain pauses. There is a famous composition by John Milton Cage Jr. called 4’33”, which is performed without any sound at all. Naturally, the design can’t treat such a long pause as audio, but the detection algorithm ignores pauses shorter than 5 seconds.

Finally, the design should cut inaudible frequencies—less than 20 Hz and more than 20 kHz. We will use these principles as the basis for designing an audio signal detector while employing the SLG47502 programmable mixed-signal chip.

Detection device implementation

Design architecture

The architecture of this device is shown in Figure 2 and contains the following building blocks:

  1. Quantization of the analog audio signal. This maps the continuous analog values to two-level (binary) values. After this step, the only information the design needs is the frequency of the audio signal.
  2. High cut filter. This ignores frequencies higher than 20 kHz.
  3. Low cut filter. This ignores frequencies lower than 25 Hz.
  4. Frequency crossing counter. This counts how many times the signal frequency crosses each reference frequency (high, mid, and low) during a certain period of time (the measuring time), according to Table 1.
  5. Audio pause. This detects audio pauses and ignores them if they are shorter than 5 seconds.
  6. Measuring time. The given period of time during which calculations are made.
  7. D-FlipFlop (DFF). This stores the audio detection result during the measuring time and outputs it to PIN12 (AudioDetect).
  8. Five Minutes No Audio Signal. This detects a five-minute idle time of the audio signal and sets a high level on PIN11 (FiveMinutesNoAudioSignal).

Figure 2 The device architecture diagram highlights the major building blocks. Source: Dialog Semiconductor

Block configuration

Analog part: The source of the audio signal should be connected to PIN9 (AUDIO_IN-) and PIN10 (AUDIO_IN+). PIN10 (AUDIO_IN+) is an input of the analog comparator (ACMP), and PIN9 (AUDIO_IN-) is a reference voltage (500 mV). Because the audio signal is an alternating signal and the IC runs from a single supply, the design biases the input audio signal by 500 mV to avoid negative voltages. The biased signal then goes to ACMP0H (Figure 3), which quantizes it for the remaining part of the design.
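As a rough software analogy of this stage, the sketch below biases an alternating input by 500 mV and compares it against the same 500 mV reference. It assumes an ideal comparator with no hysteresis, and the function name and sample values are made up for illustration; the real ACMP0H configuration may differ.

```python
from typing import List, Sequence

BIAS_V = 0.5  # 500 mV bias and reference level, as stated in the text


def quantize(samples_v: Sequence[float], bias_v: float = BIAS_V) -> List[int]:
    """Model an ideal comparator: output 1 when the biased input exceeds the reference.

    samples_v is the alternating audio signal centred on 0 V. Hysteresis and
    comparator delay are deliberately ignored in this simplified model.
    """
    reference_v = bias_v
    return [1 if (sample + bias_v) > reference_v else 0 for sample in samples_v]


# A small made-up input swinging around 0 V.
levels = quantize([0.1, -0.1, 0.2, -0.3])
```

The output is a two-level sequence that keeps only the timing of the input, which is all the later frequency-based blocks need.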

Figure 3 The analog part represents the source of an audio signal comprising analog comparator and reference voltage pins. Source: Dialog Semiconductor

High cut filter: A delay (8-bit CNT7/DLY7 (MF7)) is used to filter out frequencies higher than 20 kHz (Figure 4). Design engineers can adjust the cutoff period by writing Counter Data to 0xA0 <1287:1280> through I2C.

Figure 4 A high cut filter employs a delay to filter out frequencies higher than 20 kHz. Source: Dialog Semiconductor

Low cut filter: The low-cut filter shown in Figure 5 consists of two parts:

  1. Deglitch filter. Because no CNT/DLY blocks are available to filter random glitches, the deglitch filter is implemented with a look-up table (3-bit LUT8), a shift register (SHR 13), and a DFF (DFF12). The designer can adjust the duration of the random pulses that are filtered out by writing Counter Data to 0x69 <845:842> through I2C.
  2. Frequency detector. The low cut itself is implemented with a frequency detector (CNT5/DLY5), which cuts off frequencies lower than 25 Hz. The designer can adjust the cutoff period by writing Counter Data to 0x94 <1191:1184> through I2C.
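Because both filters work on the period of the two-level signal rather than on its amplitude, their combined effect can be modelled as a simple period check. The sketch below is a hypothetical software equivalent using the 25 Hz and 20 kHz limits from the text; it does not model the counter data values or the deglitch stage.

```python
LOW_CUT_HZ = 25.0       # low cut limit from the text
HIGH_CUT_HZ = 20_000.0  # high cut limit from the text


def period_in_band(period_s: float) -> bool:
    """Return True if a signal period corresponds to a frequency inside the pass band."""
    if period_s <= 0:
        return False  # a non-positive period is not a valid measurement
    shortest_period_s = 1.0 / HIGH_CUT_HZ  # 50 microseconds
    longest_period_s = 1.0 / LOW_CUT_HZ    # 40 milliseconds
    return shortest_period_s <= period_s <= longest_period_s


# Example periods: 1 kHz passes, 20 Hz and 100 kHz are rejected.
passes_1khz = period_in_band(1e-3)
passes_20hz = period_in_band(0.05)
passes_100khz = period_in_band(1e-5)
```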

Figure 5 A low cut filter comprises a deglitch filter and a frequency detector. Source: Dialog Semiconductor

Frequency crossing counter: This block consists of several parts. The first part is EDGE DET (Figure 6). It converts the two-level audio signal to a series of short pulses that preserve the frequency of the current audio signal. The next step is to detect when the current frequency of the audio signal crosses the reference frequencies, as shown in Table 2 and Figure 7.
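The edge detection step can be illustrated in software as follows. This sketch assumes a uniformly sampled two-level signal and detects rising edges only, which is an assumption since the text does not say which edge EDGE DET uses; the function names and sample rate are hypothetical. The frequency is recovered from the spacing between consecutive edges.

```python
from typing import List, Sequence


def rising_edge_times(levels: Sequence[int], sample_rate_hz: float) -> List[float]:
    """Return the time in seconds of every 0-to-1 transition in a sampled signal."""
    return [
        index / sample_rate_hz
        for index in range(1, len(levels))
        if levels[index - 1] == 0 and levels[index] == 1
    ]


def edge_frequencies(edge_times_s: Sequence[float]) -> List[float]:
    """Convert the spacing between consecutive edges into frequencies in Hz."""
    return [
        1.0 / (later - earlier)
        for earlier, later in zip(edge_times_s, edge_times_s[1:])
        if later > earlier
    ]


# A square wave sampled at 1 kHz that toggles on every sample.
edges = rising_edge_times([0, 1, 0, 1, 0, 1], sample_rate_hz=1000.0)
freqs = edge_frequencies(edges)
```

The resulting list of frequencies is the kind of input a crossing counter would compare against the reference frequencies.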

Figure 6 The first part of the frequency crossing counter converts a double-level audio signal to a series of short pulses. Source: Dialog Semiconductor

Table 2 During frequency detection, the crossing frequencies can be updated through I2C. Source: Dialog Semiconductor

Counting the number of frequency crossings with the reference frequencies is carried out by the shift registers (SHR7, SHR8, SHR9).

Figure 7 This is how the crossing of the current frequency of the audio signal with the reference frequencies is detected. Source: Dialog Semiconductor

Audio pause: The audio pause block is implemented with the frequency detector, as highlighted in Figure 8 and Table 3. This block detects pauses in the audio signal. A pause shorter than 5 seconds is ignored and the audio signal is considered continuous. If the pause is longer than 5 seconds, the design treats it as no audio signal at all.
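The pause rule can be sketched as grouping signal activity into continuous segments. The code below is a hypothetical model that works on a sorted list of edge timestamps: gaps shorter than 5 seconds are bridged, and longer gaps start a new segment. The text does not say how a pause of exactly 5 seconds is treated, so this sketch arbitrarily counts it as a break.

```python
from typing import List, Sequence

MAX_PAUSE_S = 5.0  # pauses shorter than this are ignored, per the text


def split_on_long_pauses(
    edge_times_s: Sequence[float], max_pause_s: float = MAX_PAUSE_S
) -> List[List[float]]:
    """Group sorted edge timestamps into segments separated by long pauses."""
    segments: List[List[float]] = []
    for time_s in edge_times_s:
        if segments and time_s - segments[-1][-1] < max_pause_s:
            segments[-1].append(time_s)   # short gap: same continuous signal
        else:
            segments.append([time_s])     # first edge or long gap: new segment
    return segments


# Gaps of 1 s and 3 s are bridged; the 6 s gap splits the signal in two.
segments = split_on_long_pauses([0.0, 1.0, 4.0, 10.0, 11.0])
```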

Figure 8 The audio pause block is implemented with a frequency detector. Source: Dialog Semiconductor

Table 3 Audio pause data; the crossing frequencies can be updated through I2C. Source: Dialog Semiconductor

Measuring time: The design counts the number of crossings of reference frequencies over a specific time window, which is controlled by a counter, as highlighted in Figure 9 and Table 4. If the frequency crossing counter doesn’t detect an audio signal, including an audio pause, during the measuring time, the design identifies it as no-signal.

Figure 9 Measuring time block counts the number of crossings of reference frequencies at a specific time. Source: Dialog Semiconductor

Table 4 The measuring time data relates to the number of crossings of reference frequencies. Source: Dialog Semiconductor

Audio signal presence storage: The presence of an audio signal is stored by DFF0, as shown in Figure 2. The signal is set using P DLY (in both-edge delay mode) and an LUT (3-bit LUT13).

No-audio signal: If the design doesn’t detect any audio signal for about 5 minutes, it sets a high level on PIN11 (FiveMinutesAudioPause). This time is counted with an LUT (3-bit LUT3) and a delay (CNT6/DLY6), and is set according to Table 5.
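A minimal software model of this timeout might look like the sketch below. The class name and the polling-style `update` method are hypothetical, and the 300-second value simply reflects the approximate five minutes mentioned in the text; the actual counter settings come from Table 5.

```python
NO_AUDIO_TIMEOUT_S = 300.0  # roughly five minutes, per the text


class NoAudioTimer:
    """Track how long it has been since audio was last detected."""

    def __init__(self, timeout_s: float = NO_AUDIO_TIMEOUT_S) -> None:
        self.timeout_s = timeout_s
        self.last_audio_s = 0.0  # time of the most recent detection

    def update(self, now_s: float, audio_detected: bool) -> bool:
        """Return True (flag high) once no audio has been seen for timeout_s."""
        if audio_detected:
            self.last_audio_s = now_s  # any detection restarts the countdown
        return now_s - self.last_audio_s >= self.timeout_s


timer = NoAudioTimer()
flag_at_10s = timer.update(10.0, audio_detected=True)
flag_at_200s = timer.update(200.0, audio_detected=False)
flag_at_310s = timer.update(310.0, audio_detected=False)
flag_after_audio = timer.update(320.0, audio_detected=True)
```

In this model the flag goes high 300 seconds after the last detection and drops again as soon as audio returns.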

Table 5 Counting no-audio time is carried out according to this information. Source: Dialog Semiconductor

Typical application circuit

Figure 10 The above diagram shows a typical application circuit. Source: Dialog Semiconductor

Hardware testing

Channel 1 (yellow, top)—PIN#10 (AUDIO_IN+)

Channel 2 (blue, bottom)—PIN#12 (AudioDetect)

Ground of oscilloscope is connected to PIN9 (AUDIO_IN-)

Figure 11 Waveforms show testing with a record playing (a) and testing with FM radio tuning (b).

Audio detector design

This article describes the design of an audio detector built with the SLG47502 programmable mixed-signal chip. The proposed method is based on the changing frequency of an audio signal: if the frequency of the input signal changes a certain number of times, the device identifies the signal as audio. The design makes allowances for pauses in audio. If no audio signal is identified within five minutes, the device sets a high level on PIN11. Note that if the level of the input signal is relatively low, this design cannot identify audio.

The complete design file created in the GreenPAK Designer software can be found here.

This article was originally published on Planet Analog.

Viacheslav Kolsun is an application engineer at Dialog Semiconductor.

 
