Xilinx integrates stacked HBM to address bandwidth and security

Article By: Xilinx Inc.

New member of Xilinx Versal portfolio integrates stacked high bandwidth memory (HBM) to accelerate compute on massive connected data sets.

Xilinx Inc. has upped its game in addressing performance bottlenecks in networking and data centers with a new series in its Versal adaptive compute acceleration platform (ACAP) portfolio. The series integrates high bandwidth memory (HBM) to enable fast compute acceleration for massive, connected data sets with fewer, lower-cost servers.

Its new Versal HBM series integrates advanced HBM2e DRAM, providing 820GB/s of throughput and 32GB of capacity for 8X more memory bandwidth and 63% lower power than DDR5 implementations (a comparison based on a typical system implementation of four DDR5-6400 components). Xilinx said the Versal HBM series is architected to keep up with the higher memory needs of the most compute-intensive, memory-bound applications for data center, wired networking, test and measurement, and aerospace and defense.
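The 8X figure can be sanity-checked with simple arithmetic. The sketch below is a back-of-the-envelope estimate, not Xilinx's own calculation: the article gives the HBM throughput, the DDR5 speed grade, and the component count, but not the DDR5 bus width. The 32-bit width per component used here (the width of one DDR5 subchannel) is an assumption chosen for illustration, and the function and constant names are hypothetical.

```python
# Back-of-the-envelope check of the "8X more memory bandwidth" claim.
# ASSUMPTION (not stated in the article): each of the four DDR5-6400
# components contributes a 32-bit data interface.

def ddr_peak_gb_per_s(transfers_mt_per_s: int, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: transfers per second times bytes per transfer."""
    return transfers_mt_per_s * 1e6 * (bus_width_bits / 8) / 1e9

HBM_GB_PER_S = 820.0         # stated in the article
DDR5_MT_PER_S = 6400         # DDR5-6400, stated in the article
NUM_COMPONENTS = 4           # stated in the article
ASSUMED_BUS_WIDTH_BITS = 32  # assumption, see note above

ddr5_total = NUM_COMPONENTS * ddr_peak_gb_per_s(DDR5_MT_PER_S, ASSUMED_BUS_WIDTH_BITS)
ratio = HBM_GB_PER_S / ddr5_total

print(f"DDR5 aggregate peak: {ddr5_total:.1f} GB/s")
print(f"HBM2e vs. DDR5:      {ratio:.1f}x")
```

Under that assumption the four DDR5 components peak at about 102.4GB/s in aggregate, which puts 820GB/s at roughly eight times the DDR5 figure. A different bus width would change the ratio proportionally, so the result only shows that the stated numbers are mutually consistent under one plausible configuration.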

Figure: Xilinx said the new Versal HBM series is architected for fast data movement and adaptive processing. (Source: Xilinx)

In a briefing with embedded.com, Mike Thompson, the senior product line manager for Xilinx Versal FPGAs, said, “There are three major trends at the moment: exponential growth of network traffic and data to be processed; DDR bandwidth availability which leads to performance bottlenecks; and the third is data security. Versal HBM increases the capacity of each of these three layers, particularly since bandwidth and security requirements are outpacing current processing and memory technologies.”

The Versal HBM series uses high-bandwidth memory integrated using stacked silicon interconnect (SSI) based on TSMC’s CoWoS (chip on wafer on substrate) 3D stacking technology. Thompson said this heterogeneous integration is a key part of addressing the so-called end of Moore’s Law. He said traditional architectures are bottlenecked on memory and network access for real-time applications.

Figure: HBM integration in Versal. The Versal HBM swaps out one super logic region (SLR) from the Versal Premium to drop in the HBM2e stack, and another SLR to add an integrated HBM controller. (Source: Xilinx)

The Versal HBM series uses the foundation provided by Xilinx Versal Premium, but replaces one super logic region (SLR) in the device with the HBM2e stack, and another SLR with an integrated HBM controller. This enables an architecture for fast data movement and adaptive processing through the integration of a networked intellectual property (IP) and memory subsystem. Thompson indicated that the Versal HBM integrates 14 equivalent FPGAs (compared with Xilinx Virtex UltraScale+), and replaces 32 DDR5 chips with integrated HBM.

Figure: Versal HBM replaces multiple devices. The Versal HBM integrates 14 equivalent FPGAs (compared with Xilinx Virtex UltraScale+), and replaces 32 DDR5 chips with integrated HBM. (Source: Xilinx)

The new HBM platform incorporates power-optimized networking cores for high bandwidth, secure connectivity. The Versal HBM series offers 5.6Tb/s of serial bandwidth with 112Gb/s PAM4 transceivers, 2.4Tb/s of scalable Ethernet bandwidth, 1.2Tb/s of line rate encryption throughput, 600Gb/s of Interlaken connectivity, and 1.5Tb/s of PCIe Gen5 bandwidth with built-in DMA, supporting both CCIX and CXL. This broad set of hardened IP provides off-the-shelf, multi-terabit networked connectivity for a breadth of protocols, data rates, and optical standards, enabling optimal power and performance and fast time to market.

As an adaptive, heterogeneous compute platform, the Versal HBM series is engineered to accelerate a wide range of workloads with large data sets, integrating adaptable engines for low-latency hardware parallelism, DSP engines for AI inference and signal processing, and scalar engines for embedded compute, platform management, and secure boot and configuration. Unlike fixed-function accelerators, the Versal HBM series can dynamically reconfigure hardware in milliseconds to adapt to evolving algorithms and emerging protocols, eliminating the need for hardware redesign and re-deployment. Thompson told us, “This with adaptable compute is important for agile design.”

This convergence of adaptable compute with high bandwidth memory and multi-terabit connectivity enables next-generation cloud acceleration and secure networking. Versal HBM ACAPs deliver good performance and power efficiency for big data workloads including fraud detection, recommendation engines, database acceleration, data analytics, financial modeling, and deep learning inference for natural language processing (NLP). Because the devices improve runtimes by orders of magnitude over modern server-class CPUs while supporting 4X larger data sets, users can deploy applications with massive, connected data sets on far fewer, lower-cost servers.

Figure: Faster run times on bigger data sets. Versal HBM ACAPs deliver good performance and power efficiency for big data workloads including fraud detection and recommendation engines. (Source: Xilinx)

Similarly, Versal HBM ACAPs deliver network scalability and performance for 800G routers, switches, and security appliances. A traditional network processing unit (NPU) implementation of an 800G next-generation firewall would require multiple NPU devices and DDR modules, whereas a single Versal HBM ACAP eliminates external memories and performs packet processing, security processing, and adaptable AI-infused anomaly detection at dramatically lower power and at a fraction of the form factor. The series delivers major CapEx and OpEx savings for cloud and network providers by enabling customers to use fewer devices and systems to implement their applications.

Accessible to both hardware and software developers, Versal HBM ACAPs provide a design-entry point for any developer, including Vivado Design Suite for hardware developers, the Vitis unified software platform for software developers, and Vitis AI for data scientists with domain-specific frameworks and acceleration libraries.

The Versal HBM series is built on the foundation of production-proven 7nm Versal devices. Developers can start prototyping on Versal Premium series devices and evaluation boards and readily migrate to the Versal HBM series. The Versal HBM series will begin sampling in the first half of 2022. Documentation is available now and tools will be available in the second half of 2021 via an early access program.

This article was originally published on Embedded.

Nitin Dahad is a correspondent for EE Times, EE Times Europe and also Editor-in-Chief of embedded.com. With 35 years in the electronics industry, he’s had many different roles: from engineer to journalist, and from entrepreneur to startup mentor and government advisor. He was part of the startup team that launched 32-bit microprocessor company ARC International in the US in the late 1990s and took it public, and co-founder of The Chilli, which influenced much of the tech startup scene in the early 2000s. He’s also worked with many of the big names—including National Semiconductor, GEC Plessey Semiconductors, Dialog Semiconductor and Marconi Instruments.

 
