Interest is growing in voice control for electronic products, and a good design has to start with a good microphone

By Richard Quinnell, Editor-in-Chief

For decades, researchers have been trying to give computers the ability to understand human speech, and the technology is finally ready for prime time. With the advent of personal voice assistants like Apple’s Siri, the Amazon Echo, and Google Home, voice control of electronic systems is becoming a must-have feature. Starting with the right microphone is the key to achieving optimal performance in such designs.

The technology of choice for voice-controlled devices is the microelectromechanical systems (MEMS) microphone. These have essentially replaced older electret condenser microphones for many reasons. For one thing, MEMS microphones are small — as little as 2.5 x 1.6 x 0.9 mm. More importantly, however, they can offer stable performance that does not drift over time and can have better phase matching to allow accurate beamforming.

A MEMS microphone works by capacitance. Micromachining of silicon produces an acoustic chamber with a flexible plate as one wall, and pressure waves (sound) move that plate. The changing capacitance between the plate and the rest of the chamber creates an electrical signal that represents the sound.

Two types of MEMS microphones exist. Analog microphones, like their larger counterparts, simply provide the sensor signal, perhaps conditioned or filtered but essentially unaltered. They thus require an external ADC if they are to be used in machine voice recognition. Digital microphones include the ADC and other digital elements onboard to convert the sensor signal into a digital data stream, typically pulse-density-modulated (PDM).
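To make the pulse-density-modulated output more concrete, here is a minimal Python sketch of how a 1-bit stream can carry an audio level and how a host might recover multi-bit samples from it. The function names, the simple first-order modulator, and the decimation factor of 64 are illustrative assumptions, not taken from any vendor's part; production designs typically use multi-stage CIC and FIR filters rather than the plain block average shown here.

```python
def pdm_encode(samples):
    """Illustrative first-order sigma-delta modulator.

    Takes floats in [-1, 1] (one per PDM clock) and returns a list of 0/1 bits
    whose density of ones tracks the input level.
    """
    bits = []
    integrator = 0.0
    feedback = 0.0
    for x in samples:
        # Accumulate the error between the input and the previous output bit.
        integrator += x - feedback
        bit = 1 if integrator >= 0 else 0
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits


def pdm_to_pcm(bits, decimation=64):
    """Crude PDM-to-PCM conversion: average each block of `decimation` bits.

    A density of ones between 0 and 1 is mapped back to a level in [-1, 1].
    """
    pcm = []
    for start in range(0, len(bits) - decimation + 1, decimation):
        block = bits[start:start + decimation]
        density = sum(block) / decimation
        pcm.append(2.0 * density - 1.0)
    return pcm


# A constant input of 0.5 yields a bitstream that is about 75% ones.
bits = pdm_encode([0.5] * 256)
print(pdm_to_pcm(bits))  # four values, each close to 0.5
```

The block average acts as a very simple low-pass filter followed by downsampling, which is the essential job of any PDM decoder.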

Fig. 1: A MEMS microphone is essentially a variable capacitor made by micromachining a movable silicon plate in an acoustic chamber. (Source: STMicroelectronics)

MEMS microphone technology first saw adoption in mobile devices and laptop computers, and these applications are still driving the market. Most such devices have multiple microphones, positioned, for example, near the bottom of a phone for voice pickup and adjacent to camera lenses for video sound pickup. These microphones, however, are intended to capture sound for human listening, which gives designers considerable freedom in applying data compression and other filter algorithms. Machine listening has different audio processing needs.

Before evaluating a MEMS microphone for voice control, developers first need to decide how the system is to be used. A battery-powered, handheld device such as a remote control, for instance, will probably always be held near the speaker’s mouth. The design may thus require only one microphone, and acoustic parameters such as signal-to-noise ratio (SNR) may not be as big a consideration as power consumption. But in a device such as the Amazon Echo, which must reliably understand voices coming from meters away in a noisy environment, the key parameters would be both the SNR (which determines the softest sound reliably sensed) and the acoustic overload point (AOP), which is the loudest sound pressure level (SPL) the sensor can handle without saturating.

A second upfront consideration is where and how the microphone is to be mounted. MEMS microphones come in two orientations: top port, with the sound inlet aperture pointed away from the mounting surface, and bottom port, with the aperture facing the mounting surface. With the bottom port, the PCB on which the microphone is mounted must have a through-hole that aligns with the aperture so that sound can enter. While that may seem a more complicated approach, bottom-port microphones currently dominate the market. Mobile device designers have been mounting the microphone on flexible circuits that are then attached to the device’s housing, and in these designs, the bottom-port design greatly simplifies assembly.

Fig. 2: A bottom-port microphone has its acoustic aperture on the same side as its PCB mounting pads.

In considering the mounting, vendors also warn that the industrial design’s acoustic properties need attention. Even the best microphone will deliver poor performance if the housing’s sound aperture and the enclosure’s resonant chamber are not appropriate. Depending on the application, a gasket or other barrier may also be needed to prevent dust or water from entering the microphone’s sound aperture.

Fortunately, MEMS microphone vendors can help. Vendors like Infineon, for instance, partner with acoustic specialists to offer reference designs that customers can leverage. Others, like STMicroelectronics, offer acoustic simulation services to help validate customer designs based on the customer’s 3D drawings.

With these basic considerations out of the way, designers can then evaluate the performance characteristics of individual microphones to make a final selection. The key parameters for a representative selection of available MEMS microphones are listed in a spreadsheet, downloadable by registered EP readers. Some of these specifications include:

  • Port sensitivity: the output signal magnitude given a reference 94-dB SPL acoustic signal at the port. This helps indicate the softest sounds that the microphone will pick up.
  • SNR: the difference (in dB) between the microphone’s noise floor and the signal produced by a 94-dB SPL, 1-kHz sound wave.
  • Dynamic range: the spread of sound intensity levels that the microphone can reliably capture without distortion. It is essentially the difference (in dB) between the microphone’s noise floor and its AOP.
  • Frequency range: the audio frequency range to which the microphone can respond without loss of sensitivity.
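The relationships among these specifications reduce to simple arithmetic, sketched below in Python. The function names and the example values (65-dB SNR, 120-dB SPL AOP, -38-dBV sensitivity) are hypothetical and chosen only for illustration; they do not describe any particular part. The sketch assumes the usual convention that 94 dB SPL corresponds to 1 Pa and that an analog microphone's sensitivity is quoted in dBV referenced to 1 V/Pa.

```python
REFERENCE_SPL_DB = 94.0  # reference level used for SNR and sensitivity (1 Pa)


def noise_floor_db_spl(snr_db):
    """Equivalent acoustic noise floor implied by an SNR measured at 94 dB SPL."""
    return REFERENCE_SPL_DB - snr_db


def dynamic_range_db(snr_db, aop_db_spl):
    """Dynamic range as the span from the noise floor up to the AOP."""
    return aop_db_spl - noise_floor_db_spl(snr_db)


def sensitivity_dbv_to_mv_per_pa(sensitivity_dbv):
    """Convert an analog sensitivity in dBV (re 1 V/Pa) to millivolts per pascal."""
    return 1000.0 * 10 ** (sensitivity_dbv / 20.0)


snr_db = 65.0        # hypothetical datasheet value
aop_db_spl = 120.0   # hypothetical datasheet value

print(noise_floor_db_spl(snr_db))                      # 29.0
print(dynamic_range_db(snr_db, aop_db_spl))            # 91.0
print(round(sensitivity_dbv_to_mv_per_pa(-38.0), 2))   # 12.59
```

In this example, a 65-dB SNR places the noise floor at 29 dB SPL, so sounds much quieter than that would be buried in the microphone's own noise.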

Fig. 3: Click on the “Download Guides” button at the bottom of this article to download this MEMS Microphone Selection Guide.

Some vendors have also indicated special features of their microphone offerings that developers can consider. One of the more intriguing is a low-power mode that allows a microphone to operate in a less sensitive but always-on state. Operating in this state allows the design to conserve power while remaining active in order to detect a “wake-up” word, such as “Alexa.” Once the wake-up word is detected, the system can then switch to full-sensitivity (and full-power) operation. This mode can be especially valuable for battery-operated designs because it helps extend battery life.
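The mode switching described above can be pictured as a small state machine. The Python sketch below is only an illustration: the class and method names are hypothetical, the wake-word and speech detectors are assumed to exist elsewhere and are represented here by boolean inputs, and the return to low-power mode after a few quiet frames is an assumption added to complete the example.

```python
from enum import Enum


class MicMode(Enum):
    LOW_POWER = "low_power"                # less sensitive, always on
    FULL_PERFORMANCE = "full_performance"  # full sensitivity, full power


class VoiceFrontEnd:
    """Hypothetical controller that picks the microphone mode frame by frame."""

    def __init__(self, idle_timeout_frames=3):
        self.mode = MicMode.LOW_POWER
        self.idle_timeout_frames = idle_timeout_frames
        self._idle_frames = 0

    def process_frame(self, wake_word_heard, speech_present):
        if self.mode is MicMode.LOW_POWER:
            # Stay in the low-power state until the wake-up word is detected.
            if wake_word_heard:
                self.mode = MicMode.FULL_PERFORMANCE
                self._idle_frames = 0
        elif speech_present:
            self._idle_frames = 0
        else:
            # Assumed behavior: drop back to low power after sustained silence.
            self._idle_frames += 1
            if self._idle_frames >= self.idle_timeout_frames:
                self.mode = MicMode.LOW_POWER
        return self.mode
```

In a real product, the mode change would typically be carried out by whatever mechanism the chosen microphone's datasheet specifies.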

MEMS microphones are only the first element of voice-activated system designs, however. They must often be followed by signal processing that combines signals from multiple microphones for noise reduction. Noise-reduction techniques include beamforming, which uses multiple microphones to focus the system’s acoustic response in the direction of the person speaking, and “barge-in” operation, which cancels out the sound that the device is itself generating so that the user does not need to shout over it to be heard.
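One of the simplest forms of beamforming is delay-and-sum: each microphone's signal is shifted to undo the extra travel time from the chosen direction, and the aligned signals are averaged so that sound from that direction adds coherently while sound from elsewhere partially cancels. The Python sketch below assumes a straight-line array, a distant (far-field) source, and whole-sample delays; the function names and geometry convention are illustrative assumptions, and commercial voice front ends generally use more sophisticated adaptive methods.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in room-temperature air


def steering_delays(mic_x_m, angle_deg, sample_rate_hz):
    """Whole-sample arrival lags for microphones on a line at positions mic_x_m.

    angle_deg is measured from broadside (perpendicular to the array), with
    positive angles toward +x. The earliest-hit microphone gets a lag of 0.
    """
    sin_theta = math.sin(math.radians(angle_deg))
    # A microphone farther toward the source hears the wavefront earlier.
    lags = [-x * sin_theta / SPEED_OF_SOUND_M_S * sample_rate_hz for x in mic_x_m]
    earliest = min(lags)
    return [round(lag - earliest) for lag in lags]


def delay_and_sum(channels, delays):
    """Align each channel by its lag (in samples) and average the results.

    channels: list of equal-length sample lists, one per microphone.
    delays:   delays[i] is how many samples channel i lags the earliest one.
    """
    usable = len(channels[0]) - max(delays)
    output = []
    for n in range(usable):
        total = sum(ch[n + d] for ch, d in zip(channels, delays))
        output.append(total / len(channels))
    return output


# Two microphones 10 cm apart, source along the array axis, 16-kHz sampling.
print(steering_delays([0.0, 0.1], 90.0, 16000))  # [5, 0]
```

The 5-sample lag in the example shows why phase matching between microphones matters: at 16 kHz, an error of even one sample is a noticeable fraction of the total delay across a small array.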

This article only scratches the surface of MEMS microphone operation and evaluation, of course. Fortunately, there are many vendor tutorials and other guidelines online for further study.

You can also learn about some of the voice signal processing options and their development kits by following the article links below.

Related articles:
Audio Pre-processing System Reference Design for Voice-based Applications Using C6747
How to build your own Amazon Echo — or something like it
AcuEdge dev kit simplifies Amazon Alexa Voice Service integration
XMOS and Qualcomm launch voice-processing dev kits for cloud-based speech recognition
Cirrus joins chorus of Alexa pre-processing chip providers