Analog memory embedded in neural net processing SoCs

Article by: Majeed Ahmad

A neuromorphic memory stores synaptic weights in on-chip floating-gate cells to reduce system latency and cut power consumption by a factor of 10 to 20.

A new breed of system-on-chips (SoCs) serving speech recognition, voice-print recognition, and deep speech noise reduction is starting to employ analog in-memory computing solutions for simultaneously performing neural network computation and storing weights.

Unlike traditional processors that rely on DSP- and SRAM/DRAM-based approaches to store and execute machine learning models, neuromorphic memory stores synaptic weights in on-chip floating-gate cells and is optimized to perform vector-matrix multiplication (VMM) for neural networks. That significantly improves system latency and cuts power consumption by a factor of 10 to 20.

Figure 1 A neuromorphic memory technology like memBrain optimizes vector-matrix multiplication (VMM) for neural network inference through an analog compute-in-memory approach. Source: Silicon Storage Technology (SST)
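To make the compute-in-memory idea concrete, here is a minimal Python sketch of a VMM in which the weights stay in the array and each output is the sum of per-cell contributions along a column. It is an illustrative model only, not SST's implementation: the function names, the number of storage levels per cell, and the weight range are all hypothetical.

```python
# Illustrative model of an analog compute-in-memory VMM.
# All names and parameters here are assumptions, not details of memBrain.

def quantize_weight(w, levels=16, w_max=1.0):
    """Snap a weight in [-w_max, w_max] to one of `levels` evenly spaced cell states."""
    w = max(-w_max, min(w_max, w))          # clip to the programmable range
    step = 2 * w_max / (levels - 1)         # spacing between adjacent states
    return round((w + w_max) / step) * step - w_max


def program_array(weights, levels=16):
    """'Program' the array once; the stored values then stay in place."""
    return [[quantize_weight(w, levels) for w in row] for row in weights]


def analog_vmm(inputs, array):
    """Multiply an input vector by the stored matrix.

    Each row is driven by one input and each column accumulates the
    contributions of its cells, loosely mimicking currents summing on a
    shared line.
    """
    n_rows, n_cols = len(array), len(array[0])
    return [
        sum(inputs[r] * array[r][c] for r in range(n_rows))
        for c in range(n_cols)
    ]


if __name__ == "__main__":
    stored = program_array([[1.0, -1.0], [0.0, 1.0]], levels=3)
    print(analog_vmm([2.0, 3.0], stored))
```

The point of the sketch is that the weight matrix is written once and never moved again; only the input vector and the column sums cross the array boundary.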

Case in point: WITINMEM’s ultra-low-power neural processing SoC streamlines and improves speech processing performance in intelligent voice applications. The SoC incorporates the memBrain neuromorphic memory solution from Silicon Storage Technology (SST), a subsidiary of Microchip Technology.

Figure 2 WITINMEM’s neural processing SoC enables sub-mA systems to reduce speech noise and recognize hundreds of command words in real-time and immediately after power-up. Source: Silicon Storage Technology (SST)

Current neural net models may require 50 million or more synapses, or weights, for processing, and shuttling that many weights to and from off-chip DRAM creates a bottleneck for neural net computing and increases overall compute power. Computing-in-memory technologies like SST’s SuperFlash memBrain eliminate this massive data-communication bottleneck by storing the neural model weights as values in the memory array and using that same array as the neural compute element.
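A rough back-of-the-envelope calculation shows why off-chip weight traffic adds up. The sketch below takes the 50 million weights from the text; the 8-bit weight width, the rate of 100 inferences per second, and the assumption that every weight is fetched once per inference are illustrative guesses, not figures from SST or WITINMEM.

```python
# Back-of-the-envelope weight traffic for an off-chip memory design.
# Only NUM_WEIGHTS comes from the article; the other values are assumptions.

NUM_WEIGHTS = 50_000_000      # "50 M or more synapses or weights"
BITS_PER_WEIGHT = 8           # assumed weight precision
INFERENCES_PER_S = 100        # assumed inference rate


def model_size_bytes(num_weights, bits_per_weight):
    """Storage needed to hold every weight once."""
    return num_weights * bits_per_weight / 8


def weight_traffic_bytes_per_s(num_weights, bits_per_weight, inferences_per_s):
    """Bytes per second moved if each weight is fetched once per inference."""
    return model_size_bytes(num_weights, bits_per_weight) * inferences_per_s


if __name__ == "__main__":
    size = model_size_bytes(NUM_WEIGHTS, BITS_PER_WEIGHT)
    traffic = weight_traffic_bytes_per_s(
        NUM_WEIGHTS, BITS_PER_WEIGHT, INFERENCES_PER_S
    )
    print(f"model size: {size / 1e6:.0f} MB")
    print(f"weight traffic: {traffic / 1e9:.1f} GB/s")
```

Under these assumptions the weights alone account for gigabytes per second of memory traffic, whereas a compute-in-memory array avoids that transfer because the weights never leave the array.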

By permanently storing neural models inside the memBrain solution’s processing element, this analog in-memory computing solution also supports instant-on functionality for real-time neural network processing. Moreover, as external DRAM and NOR are not required, it significantly reduces the overall bill of materials (BOM) cost.

This article was originally published on Planet Analog.

Majeed Ahmad, Editor-in-Chief of EDN and Planet Analog, has covered the electronics design industry for more than two decades.

