By Prashanth Narayanasetti, Aish Dubey, Eric Ponsot, Arojit Sunil Roychowdhury, and Sani Dewal, Texas Instruments Inc.; and Tapas Mandal and Ramadas Rajagopal, Magma Design Automation
Texas Instruments Inc. (TI), one of the leading manufacturers of semiconductors worldwide, houses operations in more than 30 countries around the world. With the 1985 opening of an R&D center in Bangalore—India’s "Silicon Valley"—TI became the first global technology company to establish a presence in India. Since that time, TI India engineers have filed more than 700 patents, amongst the highest of any technology company in India.
In collaboration with other TI colleagues across the globe, the TI India team continues to achieve many "firsts" in the technology world. In 1995, for example, TI India's engineers developed the first processor designed in India for control applications. The TI India R&D Center team also played an integral part in designing the industry's first single-chip solution for wireless handsets, which TI launched in 2002.
Today, TI India’s engineers work with the global TI base on cutting-edge design development for a full range of consumer products, including mobile handsets, communications infrastructure (base stations), video (security and surveillance, IP phones, set-top boxes), and high-performance analog devices, to name a few.
One of the most recent products that TI’s India team contributed to was the OMAP 4 processor, designed for mobile devices. The OMAP 4 design was considered a flagship product for TI’s India team. TI teams across the world depended on each other’s diligent efforts to keep the product ramp on schedule, so there was no room for error or schedule delay. Predictability was a key concern, but it is challenging to achieve with a design involving nearly 500 top-level clock domains, multiple Multi-Supply, Multi-Voltage (MSMV) and Power Shut-Off (PSO) domains, and more than 20 secondary floor plans. Additionally, the Static Timing Analysis (STA) sign-off timing closure required approximately 200 modes and 20 process corners, resulting in about 700 scenarios (mode-corner combinations) that required verification.
Pushing technological boundaries
Over the past few years, power consumption has moved to the forefront of consideration and concern in digital IC designs. Engineers can employ a variety of techniques to control a chip's dynamic switching and leakage power consumption. These include MSMV and PSO.
Although the use of advanced techniques like MSMV and PSO can dramatically reduce power consumption, they also significantly increase the complexity associated with designing such a device.
How TI engineers achieved success
Figure 1 illustrates a high-level representation of a conventional design flow. Following design capture and synthesis, the cells and macros are placed and a global route is performed. Previously, these tasks were performed independently. In today's complex designs, however, timing- and congestion-driven placement and global routing go hand-in-hand.
Next, the design is optimized by "tweaking" individual placements and routes. Although there may be hundreds of mode-corner scenarios that require evaluation by the sign-off extraction and timing analysis tools, implementation tools need to work with a reduced set because of run time and memory limits. Thus, a much smaller number of scenarios must be used in the front end of the process, chosen so that the reduced set "embraces" (covers) the full set.
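One way to picture a reduced set that "embraces" the full set is dominance pruning: drop any scenario that is no more pessimistic than another scenario on every timing path. The sketch below is purely illustrative. The scenario names, path names, slack values, and the dominance rule itself are assumptions made for this example, not the selection method used by TI or by the Talus tools.

```python
from typing import Dict, List

# Hypothetical input: slack in ns per timing path, for each scenario.
# Assumes every scenario reports the same set of paths.
Slacks = Dict[str, float]


def dominates(a: Slacks, b: Slacks) -> bool:
    """True if scenario a is at least as pessimistic as b on every path."""
    return all(a[path] <= b[path] for path in b)


def reduce_scenarios(scenarios: Dict[str, Slacks]) -> List[str]:
    """Keep only scenarios that are not covered by a more pessimistic one."""
    kept = []
    for name, slacks in scenarios.items():
        covered = any(
            other != name
            and dominates(scenarios[other], slacks)
            # Break exact ties by name so one of two identical scenarios survives.
            and (not dominates(slacks, scenarios[other]) or other < name)
            for other in scenarios
        )
        if not covered:
            kept.append(name)
    return kept


# Illustrative data only.
example = {
    "func_slow": {"p1": -0.2, "p2": 0.1},
    "func_typ": {"p1": 0.0, "p2": 0.3},
    "scan_slow": {"p1": 0.5, "p2": -0.1},
}
reduced = reduce_scenarios(example)  # func_typ is covered by func_slow
```

In this toy data, closing timing in the two remaining scenarios also closes it in the dropped one, which is the property a reduced front-end set needs.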
Evaluating scenarios in isolation is not a designer’s first choice, since modifying a placement or route to address an issue in one scenario may introduce problems in others. Thus, modern design flows employ Multi-Mode, Multi-Corner (MMMC) optimization that concurrently handles all scenarios.
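The difference between isolated and concurrent evaluation can be sketched in a few lines. This is a minimal illustration under stated assumptions: the acceptance rule, the scenario names, and the slack values are hypothetical and are not taken from any MMMC tool.

```python
from typing import Dict


def accept_change(before: Dict[str, float], after: Dict[str, float]) -> bool:
    """Judge a placement or routing change against all scenarios at once.

    before/after map scenario name -> worst slack in ns (hypothetical data).
    The change is accepted only if no scenario is pushed into a new or
    deeper violation and the overall worst slack does not degrade.
    """
    no_regression = all(
        after[s] >= 0 or after[s] >= before[s] for s in before
    )
    return no_regression and min(after.values()) >= min(before.values())


before = {"func_slow": -0.10, "func_fast": 0.05}

# Fixes func_slow but breaks func_fast: looks good in isolation, rejected here.
fix_one_break_other = {"func_slow": 0.02, "func_fast": -0.03}

# Improves func_slow without violating func_fast: accepted.
balanced_fix = {"func_slow": -0.02, "func_fast": 0.03}

decisions = (
    accept_change(before, fix_one_break_other),
    accept_change(before, balanced_fix),
)
```

A tool looking only at `func_slow` would have taken the first change and created a new problem in `func_fast`, which is the failure mode that concurrent evaluation avoids.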
Most designs require only three to five scenarios for front-end evaluation, but the OMAP 4 processor required nine scenarios to be accounted for, which – for a design of this size – would stretch most MMMC tools beyond limits. Given this, there was concern that existing design technologies would not be capable of handling a design of this complexity. The engineers at TI India collaborated with Magma Design Automation (Magma) to achieve success.
Raising the bar for tomorrow’s designs
The currently deployed global router in Magma's digital implementation solution, the Talus HGR (Hierarchical Global Router), is extremely good in terms of timing and Quality-of-Results (QoR), and has proven success with the largest, most complex designs at the 90nm, 65nm, 45nm, and 40nm technology nodes.
The Magma and TI India teams worked together to provide early access to a new routing technology called Talus GRX. Based on revolutionary new algorithms developed at the University of Bonn, Talus GRX offers improved timing along with reductions in wire length (reduced power), buffer count (reduced area and power), congestion, and limit violations.
Talus GRX is extremely efficient with regard to threading and its use of multiple processors, which dramatically speeds the global routing process. Also, as opposed to simply attempting to pack everything together as closely as possible, Talus GRX makes very efficient use of "white space" and is highly proficient with tasks like layer assignment. Overall, Talus GRX is very well suited to address the complexity of today's designs, especially those featuring complex-shaped MSMV and PSO domains. The end result is that Talus GRX is extremely fast and offers very high QoR. Consider the comparison between Talus HGR and Talus GRX used on one of TI’s designs as illustrated in Figure 2.
Another critical part of Magma’s design flow involved the use of Correlation in the Loop (CITL), as Figure 3 illustrates. CITL involves correlating the extraction and timing analysis used in placement, routing, and optimization with the extraction and timing analysis used for sign-off. This correlation can be performed using a test chip or the actual design. Once CITL is established for a particular process technology, it can be applied to future designs using the same process technology. Although CITL involves up-front work, it dramatically reduces the number of final timing ECOs that need to be fixed, and improves schedule predictability.
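The kind of check that sits behind such a correlation can be sketched as a per-path comparison of the slack seen during implementation with the slack reported at sign-off. The function name, the 0.05 ns tolerance, and the path data below are all hypothetical, chosen only to illustrate the idea. The sketch does not describe how CITL is implemented in the Talus flow.

```python
from statistics import mean
from typing import Dict, List, Union

Report = Dict[str, Union[float, List[str]]]


def correlation_report(
    impl: Dict[str, float],
    signoff: Dict[str, float],
    tolerance_ns: float = 0.05,  # assumed tolerance, for illustration only
) -> Report:
    """Compare implementation-time slack with sign-off slack, path by path.

    A positive delta means the implementation tool was more optimistic
    than sign-off on that path.
    """
    deltas = {path: impl[path] - signoff[path] for path in signoff}
    outliers = sorted(p for p, d in deltas.items() if abs(d) > tolerance_ns)
    return {
        "mean_delta_ns": mean(deltas.values()),
        "max_abs_delta_ns": max(abs(d) for d in deltas.values()),
        "outliers": outliers,
    }


# Illustrative slack values in ns.
impl_slack = {"a": 0.10, "b": -0.02, "c": 0.30}
signoff_slack = {"a": 0.08, "b": -0.03, "c": 0.10}
report = correlation_report(impl_slack, signoff_slack)
```

Paths listed under `outliers` are the ones most likely to surface as late timing ECOs, so tightening correlation on them early is where the up-front effort pays off.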
TI’s India team combined Talus’ existing technology with Talus GRX, MMMC, and CITL. This reduced the number of failed end points (FEPs) from about 77,000 to a manageable 3,000. In turn, this helped to reduce the number of main timing closure loops from between 15 and 20 for a previous version of the chip to only three to five for this new implementation. Furthermore, the new flow reduced the overall turn-around time for a timing closure loop by 50 percent from start to finish.
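To put these figures in proportion, the short calculation below works only from the numbers quoted above. The helper name is hypothetical, and the loop ratios simply pair the extremes of the two quoted ranges.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which a quantity shrank from before to after."""
    return 100.0 * (before - after) / before


# Failed end points: about 77,000 down to about 3,000.
fep_reduction_pct = percent_reduction(77_000, 3_000)

# Timing closure loops: 15 to 20 previously, three to five now.
smallest_loop_ratio = 15 / 5  # fewest old loops versus most new loops
largest_loop_ratio = 20 / 3   # most old loops versus fewest new loops
```

The FEP count therefore fell by roughly 96 percent, and the number of closure loops dropped by a factor of somewhere between 3 and about 6.7.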
The end result of using the new flow was to increase productivity, increase predictability, and increase the overall quality of the design results. Now sampling, the OMAP 4 processor "sets the bar" by which future designs will be measured. The TI India team was proud to be part of such a revolutionary design cycle that showcases the innovation happening inside TI’s worldwide base.
Figure 1: A high-level representation of a conventional design flow.
Figure 2: Comparison between Talus HGR and Talus GRX (wire density > 90 percent is highlighted).
Figure 3: A high-level representation of Magma's next-generation Talus design flow.