FPGAs for SoC design verification

Article By : Deepak Shankar, Anupurba Mukherjee, and Tom Jose, Mirabilis Design Inc.

Using FPGAs to verify an SoC design is a powerful technique and is becoming an increasingly important part of semiconductor design.

System-on-chip (SoC) integration stands behind the semiconductor industry's continued success in delivering better, smaller, and faster chips. A wide range of tools is used for the design and verification of electronic systems. Verification is one of the most important aspects, as it demonstrates the functional correctness of a design. Using FPGAs to verify an SoC design is a powerful technique and is becoming a very important part of semiconductor design.

Traditional methodologies are not sufficient to fully verify a system, which makes a compelling case for dynamic timing analysis. EDA vendors provide basic simulation solutions that adequately support low-density devices, but these tools are not powerful enough to debug and verify large-scale FPGAs to the extent that today's designers need in order to meet aggressive schedules in a cutthroat market.

A few architecture exploration tools resolve the issues related to verification by reusing the system model to test timing, power, and functionality under real-time and realistic workloads. They address many of the drawbacks of current verification solutions. In this article, we discuss each of these issues and how it can be resolved.

Problems behind current SoC verification

With the increase in size and complexity of SoCs, there is a growing need for more effective verification tools. Intense competition is shrinking time-to-market windows, which makes it difficult for designers to rely on the traditional approach of implementing a design and then testing it in hardware.

Functional simulation tests only the functional capabilities of the RTL design. It essentially applies a generic set of inputs, checks those scenarios, and determines whether the design worked or not. It provides no insight into timing, power consumption, or the design's response to workloads from the rest of the system.

Static analysis fails to find problems that only appear when the design runs dynamically. Static timing analysis can tell the user whether the design meets setup and hold requirements and whether the applied timing constraints are satisfied, but in a real system, dynamic factors can still cause timing violations on the SoC. One example is the design of time-sensitive network routers, where care must be taken to specify which priority levels may use each time slot, and to ensure that a packet of one priority level does not consume the resources allocated to another time slot. For instance, suppose control data frames are assigned Priority Level 3. In the current time slot, Class A packets start using resources. While the Class A frames are being transferred, the next time slot begins, and the packets scheduled for that slot (the control data frames) have to wait until the current transfer completes. Static analysis tools will never find this problem. Similarly, when packets are routed through a common crossbar in a network, packets can end up being dropped unless a proper flow-control mechanism is in place. Static timing analysis cannot find this problem either.
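
The sketch below is a minimal Python illustration of this scenario; the slot lengths and frame durations are assumed for the example and the code is not tied to any particular tool. It shows how a dynamic run exposes a Class A frame that starts late in its slot, overruns into the next slot, and stalls the Priority Level 3 control frame, even though every path meets static timing.

# Illustrative only: assumed slot and frame durations in microseconds.
SLOT_LEN_US = 50.0          # gate-open window per traffic class
CLASS_A_TX_US = 30.0        # transmission time of a Class A frame
CONTROL_TX_US = 10.0        # transmission time of a control frame

def simulate(class_a_start_us):
    """Return the delay suffered by the control frame scheduled at the
    start of the second slot, given when the Class A frame begins."""
    class_a_end = class_a_start_us + CLASS_A_TX_US
    control_slot_start = SLOT_LEN_US          # slot 2 opens here
    # The link stays busy until the in-flight Class A frame completes.
    control_start = max(control_slot_start, class_a_end)
    return control_start - control_slot_start

# Class A frame starts early: no interference, delay = 0.
print(simulate(class_a_start_us=5.0))    # 0.0
# Class A frame starts 45 us into its slot: it overruns by 25 us and the
# Priority Level 3 control frame is stalled behind it.
print(simulate(class_a_start_us=45.0))   # 25.0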

Another methodology for verification is in-system testing: if the design works on the board and passes the test suites, then it is considered ready for release. But some issues, such as timing violations, may not appear immediately, and by the time they do, the design is already in the hands of the client.

Obtaining accuracy with an architecture simulator

To describe this new system verification solution, we use a commercial architecture simulator called VisualSim Architect from Mirabilis Design. Architecture simulators have traditionally been used for system specification validation, tradeoffs, and architecture coverage. The new trend is to expand the role of these simulators by integrating them with FPGA boards and emulators. SystemC and Verilog simulators offer a similar approach, but their system models are either too detailed or too abstract, and they do not accurately capture system scenarios at a simulation speed that enables large-scale testing.

A logical function or IP is considered a block in the graphical architecture model. Most architecture simulators have a diverse library of components that cover the ASIC design blocks implemented in RTL. The environment captures the entire architecture, allowing the user to view what the whole system is doing; the system can be a single chip, a full box, or a network. The architecture model contains the details needed to generate buffer occupancy, timing, and power-consumption statistics. It also shows how the overall system responds when a block in the design is replaced: for instance, if the memory block is replaced with an emulator or an FPGA, what is the impact on the rest of the system?
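
The sketch below is an illustrative reduction of this idea, not VisualSim's modeling language; the block names, latencies, and power figures are assumed. It shows how a system built from blocks with timing and power attributes makes the impact of swapping one block (here, the memory model) visible to the rest of the system.

# Illustrative block-based architecture model; all numbers are made up.
from dataclasses import dataclass

@dataclass
class Block:
    name: str
    latency_ns: float     # time to process one transaction
    power_mw: float       # average active power

def end_to_end(blocks, n_transactions=1):
    """Return total latency and power for a simple pipeline of blocks."""
    latency = sum(b.latency_ns for b in blocks) * n_transactions
    power = sum(b.power_mw for b in blocks)
    return latency, power

baseline = [Block("cpu", 12.0, 250.0),
            Block("interconnect", 4.0, 40.0),
            Block("dram_model", 60.0, 120.0)]

# Replace the abstract memory model with figures measured on an FPGA
# prototype of the memory controller (hypothetical values).
fpga_variant = [baseline[0], baseline[1],
                Block("dram_on_fpga", 85.0, 120.0)]

print(end_to_end(baseline))      # (76.0, 410.0)
print(end_to_end(fpga_variant))  # (101.0, 410.0) -> rest of system sees +25 ns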

Meeting timing deadlines is critical to the success of any design. Suppose, for example, that a block is expected to complete its operation within 20 µs but is observed to take 20 ms; the rest of the system suffers as a result. Such details are captured, and the user gains insight into the timing of each implementation on the FPGA relative to the rest of the system. Failing to meet timing deadlines may also mean failing the tests required to make the IP usable in the customer's product environment.
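
A minimal sketch of such a deadline check follows; the block name and budget are assumed for illustration. It simply flags a block whose measured completion time exceeds its budget so the impact on downstream blocks is reported rather than silently absorbed.

# Illustrative deadline monitor with an assumed 20 us budget.
BUDGET_US = {"dsp_filter": 20.0}

def check_deadline(block, measured_us):
    budget = BUDGET_US[block]
    if measured_us > budget:
        print(f"{block}: {measured_us} us measured vs {budget} us budget "
              f"(late by {measured_us - budget} us)")
    return measured_us <= budget

check_deadline("dsp_filter", 20000.0)   # 20 ms observed -> flagged as late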

The second interesting feature is the reduction in the cost and time spent testing each block or IP with a test chip. A test chip may cost $200,000 in NRE, another $200 to $300 for packaging and other support activities, and six to nine months of testing. With VisualSim, the user can load the RTL for a particular IP block onto an FPGA and replace the corresponding architecture block with it; a C++ API connects the model to that FPGA block. This lets the user keep the same architecture environment and observe the activity of the chip or the entire system, including performance, timing, and latency.

The IP that is placed on the FPGA will eventually go into the product, which means the user is testing the IP in the context of the real architecture. This verifies whether the IP will work in the system and also yields additional information, such as performance and power consumption. The approach is therefore not limited to verifying a single IP; it verifies the whole SoC with that IP running on the FPGA.
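
As a conceptual illustration only, the sketch below shows the idea of keeping the architecture model fixed while the block under test is swapped between an abstract model and an FPGA-backed implementation behind the same interface. The class names and interface are hypothetical and do not represent the actual VisualSim C++ API.

# Hypothetical interface; not the VisualSim API.
class AbstractIP:
    def process(self, transaction):
        # Abstract timing model: return (result, latency in ns).
        return transaction, 40.0

class FpgaBackedIP:
    def __init__(self, link):
        self.link = link              # e.g., a connection object to the board

    def process(self, transaction):
        # Send the transaction to the RTL running on the FPGA board and
        # read back both the functional result and the measured latency.
        self.link.send(transaction)
        return self.link.receive()

def run_system(ip_block, traffic):
    """The rest of the SoC model drives the block the same way in both cases."""
    total_latency = 0.0
    for txn in traffic:
        _, latency_ns = ip_block.process(txn)
        total_latency += latency_ns
    return total_latency

print(run_system(AbstractIP(), traffic=range(100)))   # 4000.0 ns with the abstract model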

Solutions that resolve your dilemmas

Most engineers have compelling reasons for refusing to do timing simulation. Some of the main concerns are:

  • It is time-consuming. Time is one of the most critical factors in the success of any design, and timing simulation is indeed time-consuming if the timing model is built from scratch. The idea here, however, is to reuse the architecture model in the VisualSim environment. This serves two purposes: the architecture model gains greater accuracy, and existing, in-house, or purchased IP can be evaluated better. It also makes it possible to conduct timing analysis of an IP for which the code is available, which is why so little time is needed.
  • It takes a lot of memory and processor power to verify. Designers have preferred component-based simulation over simulating one big design. Divide and conquer was introduced because a single FPGA board can run only a small piece of the chip, and simulating one big design consumes a lot of processor power, memory, and FPGA capacity. For instance, simulating a full SoC might require 2,000 FPGAs, and putting that many FPGAs on a board and running them is very hard. So the divide-and-conquer concept was welcomed by most designers, who hoped that each part would still work after assembly. Current tools also carry many underlying details that are not useful for verification, which slows down the simulation of the entire design. In addition, the "Keep Hierarchy" option lets a design maintain its hierarchy through implementation, creating a hierarchy for each part of a processor, for instance, but most current tools provide only one or two levels of hierarchy.
    Architecture modeling environments such as VisualSim can model the entire SoC or board. The environment tests all the functionality and completes it in a very short period of time (an hour or two) because it abstracts clocks and signals and reuses architecture blocks, which makes the simulation much faster. The FPGA board needs to contain only the specific IP to be tested. The simulator runs at 80 million events per second and over 40,000 instructions (not cycles) per second for the entire SoC, and it can run on a common Linux server, so the cost can be kept low by using off-the-shelf systems. It also supports 30 to 40 levels of hierarchy, each hierarchical component can be reused, and the model is built from these small hierarchical components.
  • There is no way to reuse the testbench from functional simulation; new testbenches have to be created. The environment, however, reuses the architecture model, which combines timing, function, and power in a single model. It provides all the required statistics, such as latency, throughput, power consumption, efficiency, quality of service, and waveforms. Because it carries all these details, it can easily be reused for both timing and functional verification.
  • Debugging the design turns out to be a chore, as the whole netlist is flattened and there is no way to single out the problem in a timely manner. The environment aids the debugging effort by providing many probes on the test-environment side. The architecture model can also be used as a reference against which to compare the output from the FPGA board, making it much easier to see which sequence has the error and to narrow down the bug (see the sketch after this list).
  • Timing simulation shows only worst-case numbers, and the design has enough slack not to be a concern. Commercial tools like VisualSim run a cycle-accurate simulation that has been shown to achieve 90% to 95% timing accuracy and 85% to 98% power accuracy, so even designs with extremely high throughput and tight timing deadlines can be tested accurately.
  • Not all the sub-modules are coded at the same site, and there is no way to split out the parts coded at each site, even though the designers of those parts understand them best and are best placed to verify them. With the entire architecture captured in VisualSim, each remote team can swap its own design into the full system model. This way, each team can test independently, and multiple teams can also combine their blocks and test together.
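
The sketch below illustrates the reference-comparison idea from the debugging point above; the transaction traces and field layout are invented for the example. Running the same stimulus through the architecture model and the FPGA, then reporting the first transaction where the two traces diverge, is far quicker than wading through a flattened netlist.

# Illustrative comparison of a reference trace against an FPGA trace.
def first_mismatch(reference_trace, fpga_trace):
    """Return (index, expected, observed) for the first divergence, or None."""
    for i, (ref, dut) in enumerate(zip(reference_trace, fpga_trace)):
        if ref != dut:
            return i, ref, dut
    return None

reference = [("rd", 0x100, 0xAB), ("wr", 0x104, 0x11), ("rd", 0x108, 0x7F)]
from_fpga = [("rd", 0x100, 0xAB), ("wr", 0x104, 0x11), ("rd", 0x108, 0x00)]

mismatch = first_mismatch(reference, from_fpga)
if mismatch:
    index, expected, observed = mismatch
    print(f"first divergence at transaction {index}: "
          f"expected {expected}, observed {observed}")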

Some highlighted benefits:

  • Cuts costs and time in testing
  • Multi-level hierarchy
  • Reuses the architecture model
  • The entire SoC or board can be modeled
  • Tests independently
  • Aids the debugging effort

Looking into the prospects

Reusing the architecture model saves an enormous amount of time. Component-based simulation is now an old concept; verifying the whole SoC in a very short period of time saves both time and cost. Tools like VisualSim can capture the entire architecture, allowing the user to view what the whole system is doing. Additional information, such as performance, timing, and latency, gives the user an idea of how the final design is going to work. The simulations are extremely fast, taking one to two hours, in contrast to traditional methods that may take days to verify a design.

This article was originally published on EEWeb.

Deepak Shankar is the founder of Mirabilis Design Inc.

Anupurba Mukherjee is a product market engineer at Mirabilis Design Inc.

Tom Jose is an application specialist at Mirabilis Design Inc.
