Trace the evolution of co-emulation and learn about writing testbenches and transactors.
Simulation acceleration with C/C++ testbenches at the signal level improved execution speed by close to two orders of magnitude versus PLI-based co-simulation. Still, the emulator – the strong link in the setup – was held back by the testbench – the weak link – and could not use all of its underlying processing power.
The breakthrough came by splitting the testbench itself into two parts. A front-end, written at a higher level of abstraction than RTL and executed on the workstation, would implement whatever verification capability was expected from the testbench. A back-end, written in RTL code and synthesised onto the emulator, would implement the testbench I/O protocols – namely, the state-machines that control the countless DUT I/O pin transitions – a compute-intensive task performed much more efficiently in hardware.
Furthermore, the communication between the front and back ends would consist of multi-cycle transactions instead of signal-level transitions. Today, this is known as a dual-domain environment with transaction-based inter-domain communication. A hardware-based domain, or HDL domain, runs in the emulator, and a software-based domain, or hardware verification language (HVL) domain, executes on the host computer.
Function call-based communication implements transactions, connecting the two parts, both inbound and outbound. The implementation can take several forms, but all should stem from an Accellera standard called SCE-MI, now at version 2.1. SCE-MI is a set of modeling APIs between behavioral models running on a workstation and synthesisable HDL models running on an emulator. The foundation of today’s standard is the SystemVerilog DPI (SV-DPI). The communication between emulator and workstation can be implemented using SCE-MI based DPI import and export functions and tasks, as well as SCE-MI pipe semantics.
The DPI is not afflicted by the drawbacks of the aforementioned PLI standard and instead presents several advantages.
The two domains require two sets of tools, are generally fed different files, and have different requirements. This scenario leads to increased performance, though the acceleration factor is dependent on the size and frequency of the transactions and function calls and other factors (figure 3).
Figure 3: A split transactor converts transactions coming from the testbench into signal-level, protocol-specific sequences required by the DUT, and vice versa. (source: Mentor Graphics).
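The split-transactor idea in Figure 3 can be illustrated with a minimal C++ sketch. It is a toy model under stated assumptions, not SCE-MI or any vendor API: the bus, its pins, and the names BusPins, BusBfm, and BusProxy are all hypothetical, and the BFM, which in a real flow would be synthesisable RTL inside the emulator, is modelled in plain C++ so that the example compiles on its own. What it shows is that one function call from the HVL side expands into many pin-level clock cycles on the HDL side.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pin-level view of a simple write-only bus, sampled once per clock.
struct BusPins {
    bool valid;
    std::uint32_t addr;
    std::uint32_t data;
};

// Stand-in for the HDL-side half of the transactor (the BFM).
// In a real flow this would be synthesisable RTL running in the emulator.
class BusBfm {
public:
    // One call is one transaction: an address cycle, one data cycle per word,
    // then an idle cycle.
    void write_burst(std::uint32_t addr, const std::vector<std::uint32_t>& words) {
        assert(!words.empty());
        drive(BusPins{true, addr, 0});
        for (std::size_t i = 0; i < words.size(); ++i) {
            const std::uint32_t beat_addr = addr + 4u * static_cast<std::uint32_t>(i);
            drive(BusPins{true, beat_addr, words[i]});
        }
        drive(BusPins{false, 0, 0});
    }
    const std::vector<BusPins>& trace() const { return trace_; }

private:
    void drive(const BusPins& pins) { trace_.push_back(pins); }  // one clock cycle
    std::vector<BusPins> trace_;
};

// HVL-side half: an untimed proxy that forwards whole transactions.
class BusProxy {
public:
    explicit BusProxy(BusBfm& bfm) : bfm_(bfm) {}
    void write_burst(std::uint32_t addr, const std::vector<std::uint32_t>& words) {
        ++calls_;  // one cross-domain function call per transaction
        bfm_.write_burst(addr, words);
    }
    std::size_t cross_domain_calls() const { return calls_; }

private:
    BusBfm& bfm_;
    std::size_t calls_ = 0;
};
```

With a BusProxy bound to a BusBfm, calling write_burst(0x1000, {1, 2, 3, 4}) crosses the domain boundary once, while the BFM records six clock cycles of pin activity (one address cycle, four data cycles, one idle cycle).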
The overall architecture fits well with an emulator, and is dominated by the emulator that now can run at speed. Appropriately, it is called co-emulation.
Three benefits were anticipated from the co-emulation approach. First, writing a testbench at a higher level of abstraction with fewer lines of code would be easier and less error-prone. Second, the workstation would process such lightweight behavioral code significantly faster. Third, the communication between the simulation front-end and the emulation back-end would move from cycle-based, pin-level synchronisation to function-based, transaction-level synchronisation, further reducing stalling of the emulator. The bigger the transaction, the fewer the synchronisation “interruptions”, and the faster the overall setup executes.
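The third benefit lends itself to a back-of-the-envelope model. The C++ sketch below assumes, purely for illustration, that every synchronisation between workstation and emulator costs a fixed overhead and that each transaction spans the same number of clock cycles. The function names and all numbers in the usage note are hypothetical, not measurements.

```cpp
#include <cassert>
#include <cstdint>

// Number of workstation/emulator synchronisations needed to cover total_cycles
// when each synchronisation (one transaction) spans cycles_per_sync clocks.
// cycles_per_sync == 1 models cycle-based, pin-level synchronisation.
std::uint64_t sync_points(std::uint64_t total_cycles, std::uint64_t cycles_per_sync) {
    assert(cycles_per_sync > 0);
    return (total_cycles + cycles_per_sync - 1) / cycles_per_sync;  // ceiling division
}

// Toy wall-clock estimate in nanoseconds: emulator run time plus a fixed
// overhead for every synchronisation.
std::uint64_t wall_time_ns(std::uint64_t total_cycles, std::uint64_t cycles_per_sync,
                           std::uint64_t cycle_ns, std::uint64_t sync_overhead_ns) {
    return total_cycles * cycle_ns
         + sync_points(total_cycles, cycles_per_sync) * sync_overhead_ns;
}
```

For one million clock cycles at an assumed 1,000 ns per cycle and an assumed 100,000 ns per synchronisation, wall_time_ns(1000000, 1, 1000, 100000) gives 101 seconds for cycle-based synchronisation, while wall_time_ns(1000000, 100, 1000, 100000) gives 2 seconds with 100-cycle transactions. In this model the emulator's own run time (1 second) is the floor that larger transactions approach.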
It is worth reviewing a transactor’s characteristics and highlighting what is required to implement a co-emulation testbench. Table 1 compares the characteristics of the two sides.
Table 1: Characteristics of dual domain co-modeling.
The transaction-based testbench (a.k.a. HVL) side is behavioral and untimed. It can be time-aware but should not contain explicit time-advancement statements such as clock or unit delays. Time advancement is executed on the HDL side, though the testbench can control timing indirectly via remote function and task calls. The testbench may be class-based, like a UVM testbench, but doesn’t need to be; either way, it stays well within a verification engineer’s comfort zone.
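The rule that the testbench is untimed while time advances only on the HDL side can be sketched in C++ as follows. This is a hypothetical model, not DPI or SCE-MI code: TimedBfm stands in for a clocked BFM in the emulator, and its wait_cycles, send, and now members play the role of remote tasks and functions called by the testbench. The two-cycle cost of send is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for the HDL side, the only place where time advances.
class TimedBfm {
public:
    // Remote task: let the design run for n clock cycles.
    void wait_cycles(std::uint64_t n) {
        assert(n > 0);  // a zero-cycle wait is treated as a testbench error here
        cycle_ += n;
    }

    // Remote task: send one word; assumed here to occupy the bus for 2 cycles.
    void send(std::uint32_t word) {
        sent_.push_back(word);
        cycle_ += 2;
    }

    // Remote function: lets the testbench be time-aware without owning time.
    std::uint64_t now() const { return cycle_; }
    const std::vector<std::uint32_t>& sent() const { return sent_; }

private:
    std::uint64_t cycle_ = 0;
    std::vector<std::uint32_t> sent_;
};

// HVL-side test: untimed, with no clock or unit delays. Timing is controlled
// only indirectly through calls into the BFM. Returns the elapsed cycle count.
std::uint64_t run_test(TimedBfm& bfm) {
    const std::uint64_t start = bfm.now();
    bfm.send(0xCAFEu);
    bfm.wait_cycles(10);  // ask the HDL side to idle for 10 clocks
    bfm.send(0xBEEFu);
    return bfm.now() - start;
}
```

Running run_test on a fresh TimedBfm reports 14 elapsed cycles (two sends of 2 cycles each plus a 10-cycle wait), yet the test body never advances time itself.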
The HDL side is synthesisable and must bear the limitations of modern synthesis technology: behavioral constructs are not generally supported, for example.
Mentor Graphics enhanced the capability to write BFMs by developing XRTL (for eXtended RTL), a superset of SystemVerilog RTL. It includes various behavioral constructs, such as implicit state machines, behavioral clock and reset generation, DPI functions, and tasks that can be synthesised onto an emulator. The HDL domain is statically elaborated, a familiar capability for most ASIC designers. Mentor calls this scenario TBX (TestBench Xpress), similar to an accelerated transactor, to enable emulation with modern testbenches.
Transactors allow the emulator to process data continuously with minimal stalling, dramatically increasing overall performance over PLI-based acceleration, and approaching the performance of ICE.
Co-emulation offers several advantages over ICE. It eliminates the need for speed-rate adapters and physical interfaces. With co-emulation, each physical interface is replaced with a virtual/logical transaction-level interface. Likewise, speed-rate adapters, required for ICE, are replaced with protocol-specific transactors.
Unlike speed-rate adapters, transactor models for the latest protocols are readily available off-the-shelf and easily upgraded to accommodate protocol revisions. Vendors and users provide libraries of transactors for standard interface protocols as well as tools to enable the development of custom, proprietary transactors.
It’s also possible to create an emulation-like environment by using transactors to connect the DUT to “virtual devices.” A virtual device is a software model of a peripheral device that runs on the workstation.
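As an illustration of a virtual device, the following C++ sketch models a text console attached to a DUT UART. Everything here is hypothetical: VirtualConsole and UartProxy are invented names, and the on_dut_byte hook stands in for whatever outbound call a real transactor would make when the DUT transmits a byte.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>

// Virtual device: a software console that runs on the workstation.
class VirtualConsole {
public:
    void receive(std::uint8_t byte) { text_.push_back(static_cast<char>(byte)); }
    const std::string& text() const { return text_; }

private:
    std::string text_;
};

// HVL-side proxy of a UART transactor. The HDL-side BFM (not modelled here)
// would deserialise the DUT's TX pin and trigger on_dut_byte once per byte.
class UartProxy {
public:
    using ByteSink = std::function<void(std::uint8_t)>;

    // Attach a virtual device that consumes bytes sent by the DUT.
    void connect(ByteSink sink) { sink_ = std::move(sink); }

    // Outbound transaction from the emulator: one byte transmitted by the DUT.
    void on_dut_byte(std::uint8_t byte) {
        assert(sink_ && "no virtual device connected");
        sink_(byte);
    }

private:
    ByteSink sink_;
};
```

A console is attached with a lambda, for example uart.connect([&console](std::uint8_t b) { console.receive(b); }), after which every byte the DUT transmits appears in console.text(). No physical UART cable or speed-rate adapter is involved.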
An additional merit of co-emulation is remote accessibility. As there are no physical interfaces connected to the emulator, a user can fully use and manage it from anywhere in the world.
Transaction-based acceleration led to speed-ups of three to four orders of magnitude over simulation. It finally gave design teams access to the full performance of the emulator without sacrificing much, if any, of the flexibility and visibility of simulation. In short, it achieved the best of both worlds.
The dual-domain partitioning is required for co-emulation, but works equally well in simulation. The architecture is verification methodology-neutral. It readily fits a methodology like UVM, since UVM follows largely the same layering principles. The transactor layer is affected here, but the BFM proxies make this largely transparent to the UVM or modern testbench domain.
In terms of verification productivity, the combination of UVM and co-emulation provides horizontal and vertical reuse benefits from UVM, and reuse across simulation, emulation, FPGA, and other platforms.
Platform-portable, emulation-compatible transactors offer a unique combination of performance, accessibility, flexibility, and scalability. Transactors support the development of a realistic system-level test environment for the DUT. They also enable rapid creation of a high-speed, system-level virtual platform by enveloping the emulated DUT with virtual components interacting with its multitude of interfaces.
The use of transactors delivers all the benefits of ICE without the challenges of rate-adapter availability and physical accessibility. No more “spaghetti cables!”
By adopting co-emulation for testbench acceleration, design teams can move their verification strategy up a level of abstraction, and achieve the verification performance and productivity necessary to fully debug and develop the most complex electronic hardware and software-based systems.
- Dr. Lauro Rizzatti is a verification consultant and industry expert on hardware emulation (www.rizzatti.com).
- Dr. Hans van der Schoot is a recognized specialist in verification and emulation technology, and currently serves as verification architect and methodologist at Mentor Graphics.
- John Stickley is a verification technologist at Mentor Graphics Emulation Division. His research interests are in virtual platforms for system level modeling and design verification.