IC designs keep growing in complexity, but AI and ML can shrink design cycles.
Artificial intelligence (AI) and machine learning (ML) come in many shapes, but whatever the intelligence looks like, it is all results-focused. If there is a clear “right way” and “wrong way” to do something, AI needs to demonstrate an ability to follow the “right way.” More pertinently, systems that employ AI must work out how to get there on their own and get better at doing it over time.
Electronic design automation (EDA) work is the ideal task for AI. The complexity of integrated circuits (ICs) means the number of possible design iterations that need to be evaluated continues to increase, but their regularity means design rules that work well can have a massive positive impact across large parts of the design. Using AI and ML to move from ‘maybe’ to ‘definitely’ in fewer iterative steps can deliver greater productivity in an automated flow, as this article explains.
Why do we need ML, and why now?
In EDA, results are everything. The industry is in a constant state of development to expedite the design process. As fabrication processes shrink, ICs become commensurately more complex, and, as any design engineer appreciates, complexity lengthens the design cycle. That is true for any type of design, whether working with discrete components or integrated transistors. Fortunately for IC design engineers, development at the transistor level follows particularly strict and relatively predictable rules. The complexity is a facet of the co-dependency between design parameters. Brute-force iteration could ultimately exhaust all possible combinations of those co-dependencies, but only at the cost of ever-longer design times.
What’s so great about machine learning?
Machine learning, or ML, is revolutionary and disruptive not because it is new, but because it does what good technology has done since the first industrial revolution: it allows us to do things better and, most importantly, faster. In turn, this means it becomes cost-effective to do more of those things, and on a bigger scale.
Knowledge and experience play an incredibly important role in reducing that time. This is where ML can be applied, to increase the productivity of experienced engineers (see sidebar, “What’s so great about machine learning?”). It is applied in two ways: “inside” and “outside.” ML inside is used behind the scenes to reduce the time it takes to arrive at design closure, while with ML outside, expert systems are used to close the loop on iterative design, which is still a very manual process and dependent on the availability and capability of the designer. Both types of ML/AI are applicable to EDA, and both are going to be increasingly important in the future of IC design. There are various methods for implementing an ML capability. Figure 1 shows how AI, ML, and deep learning (DL) can be combined into a complete solution.
The size of the design challenge is relative to the size of the design; the size of ICs is growing rapidly, but the number of IC design engineers is not keeping pace. As a result, the design challenge is growing exponentially rather than linearly, which effectively means a net loss in productivity that gets bigger every day.
As foundry process dimensions continue to shrink, transistor density increases in line with the latest node. At 7nm, it is no longer possible to create blocks with “just” 2 million cells. In fact, blocks of 5, 10, or even 15 million cells are becoming the norm, and ICs can easily integrate 50-plus such blocks.
Larger blocks equate to more complexity, even if the design can be described as step-and-repeat to some degree. The result is two-fold: larger blocks and more of them on a chip. Keeping pace with this escalation necessitates higher design efficiency and, today, that can be most effectively addressed using ML.
It would be fair to say that ML has become a buzzword, and in such circumstances the term can begin to lose its meaning. One widely accepted definition of ML and AI refers to an application being able to operate without being explicitly programmed to do the task in question.
It is this term, “explicitly programmed,” that makes ML so applicable to design automation, precisely because so much of the design process is still relatively manual. Today, it is not possible to automate every aspect of the design process in a programmatic way because so much of it does rely on the experience of the engineer. Without ML, the EDA industry does a good job of making the engineer more productive; with ML, the EDA industry will be able to do a great job.
The rapid advancement of ML has been enabled in part by the increased processing power now available, which has, in turn, allowed computer scientists to make huge advances in the way ML works, a virtuous circle that very much echoes the way the semiconductor industry functions in general.
EDA is a very natural home for ML, thanks to its unique demands. It requires iterative ‘what-if’ scenarios to be evaluated, but perhaps more important is the predictability of those scenarios. In ML terminology, inference is incredibly powerful because it allows a model to arrive at a result without having to evaluate every single point in a dataset. This allows ML to be trained in a relatively short time to deliver improved results.
As an example, one of the biggest challenges that chip designers face today relates to the placing and routing of those huge (and still growing) blocks, with respect to timing. In the early stages of implementation, getting to a point where design placement and optimization can begin in earnest depends on predicting whether the timing requirements can be met with a given floorplan. This “chicken and egg” situation relies on the engineer having enough confidence in the predicted results to spend the time necessary in routing the design in order to achieve the predicted timing results.
At this early stage of the design flow, designers need to achieve results fast with enough accuracy based on a small amount of physical design data. At the end of the place and route (P&R) process, there is much more physical data available, so timing prediction becomes much more accurate. However, that accuracy comes at the cost of runtime (Figure 2).
What does scale mean in AI?
Interestingly, scale isn’t necessarily the most important part of ML. Yes, by its nature it is inherently scalable because, in general, once training is complete, the model can be reused ad infinitum. But mass reproduction of artificially intelligent “things” doing the same “thing” many times over is not the only application for ML. It could be argued that, by its nature, ML needs to be applied to unique problems to fully exploit its potential to “learn.”
At each new advanced FinFET process node, there is an increasing number of physical design rules to account for and honor. Many of these rules will impact timing prediction in some way. The aim here is to use ML at the beginning of the flow to predict with a high enough level of accuracy the timing for a given floorplan (see sidebar, “What does scale mean in AI?”). As expressed earlier, each node has more relevant rules, but, of course, every design can, and likely will, be significantly different from any other design that has been carried out before it. For example, one design may use 10 metal layers, while another may use 15; this huge variation has massive implications on timing. It is this variability that makes it so difficult to explicitly program a solution for every conceivable design on every possible foundry node.
While it may be difficult to process this amount of data, it is exactly the kind of data that can be used to train an ML model. Every design will generate pre-route and post-route data, but more importantly, the data will also contain the evolution of that design, showing what the engineer did to hit timing objectives.
These data can be used to train the ML model inside, so when it encounters that design or one very similar to it, it will be able to predict—with greater accuracy—where to place blocks initially in a shorter amount of processing time. More accurate results at the pre-route stage will deliver better results faster at the post-route stage. Figure 3 shows how timing data from existing designs can be used to train a new ML-based timing model. The training process can be accelerated by using specialized hardware, such as GPUs, or run in the cloud.
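To make the idea concrete, the pre-to-post-route mapping can be illustrated with a deliberately simple one-feature linear fit in Python. The slack values and the single-feature model here are hypothetical illustrations only; a production flow would train a far richer model on many physical design features, not just the pre-route slack estimate.

```python
# Minimal sketch: fit a linear correction that maps pre-route slack
# estimates to post-route slack, using data from a finished block.
# All values are hypothetical illustrations, not real design data.

def fit_linear(xs, ys):
    """Ordinary least-squares fit y ~ a*x + b for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

# Training pairs from a completed block: per-endpoint slack (ns)
pre_route_slack  = [-0.30, -0.10, 0.05, 0.20, 0.40]   # early estimate
post_route_slack = [-0.42, -0.18, -0.02, 0.12, 0.30]  # signed-off result

a, b = fit_linear(pre_route_slack, post_route_slack)

def predict_post_route(pre_slack):
    """Predict post-route slack from a pre-route estimate."""
    return a * pre_slack + b
```

The learned intercept is negative here, capturing the typical pattern that early estimates are optimistic; a real model would learn such corrections across many features at once.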
As with most ML models, the more training it receives, the better it gets at predicting the desired result. In this case, even the data from a single design block can give the model enough information to produce better results.
A key point to highlight is that the trained model is only relevant to a specific type of block. If that block is a CPU on a 7nm process, it would make no sense and return no benefits to use the same model to predict post-route timing results for a GPU on a 12nm process, for example. In this way, the aggregation of data from all designs will not return a better model for any design; instead, it must be relevant to the design in hand.
Each time the model is trained on design data, it produces a model that will accelerate the pre-route process for that design or one that is very close to it. All of the training takes place inside. In other words, the data never leaves the design environment, and clearly it would be of little use in any other design flow.
This is a perfect example of why explicit programming cannot reduce turnaround time. Very large-scale ICs now feature tens of complex blocks, each containing millions of cells that in turn comprise many more millions of individual transistors. Every element of the design, right down to the characteristics of the individual transistor, is configurable, and every part of every element has a small but not insignificant influence on the overall design. Meeting this seemingly insurmountable design challenge is where EDA has always excelled, but adding ML is taking the industry to the next paradigm.
Acceleration vs accuracy
This specificity of the trained model is extremely important because it means that once the model has been trained on one block in a design, it can be used on all of the subsequent blocks in that design. As many ICs now have tens of blocks, each iteratively designed to achieve the required performance as features evolve, the pre-route acceleration and post-route accuracy that ML brings are going to be beneficial.
As mentioned earlier, the number of blocks needed to train the model is minimal, and, in fact, more doesn’t necessarily mean better. There are diminishing returns, so it is not necessary to train a model on every block in a design.
This also highlights another key feature. In AI terms, training is typically carried out using very large data sets on very powerful servers, often taking a considerable amount of time. With this approach, training can be carried out locally on development machines (optionally with GPU accelerators) using a relatively small data set, and completed in hours rather than days.
While each design comes with different specifications, in general, engineering teams are seeing beneficial results when using between five and 10 blocks to train the model, and those blocks will typically be different but intended for the same system. In addition, the training process itself can be iterative, so as more design data is generated, the model can be incrementally trained further on the data that comes out of the improved block, which itself was generated using a trained model.
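The incremental aspect of that training can be sketched in Python: if a simple linear model is kept as running sufficient statistics, it can be refit cheaply each time a new block's (pre-route, post-route) data arrives. The one-feature model and all numbers are hypothetical illustrations, not the actual training scheme.

```python
# Minimal sketch of incremental training: accumulate sufficient
# statistics so a linear timing model can be refit as each new
# block's (pre-route, post-route) slack data becomes available.
# All values are hypothetical illustrations.

class IncrementalLinearModel:
    def __init__(self):
        self.n = self.sx = self.sy = self.sxy = self.sxx = 0.0

    def update(self, xs, ys):
        """Fold in a new block's training pairs."""
        for x, y in zip(xs, ys):
            self.n += 1
            self.sx += x
            self.sy += y
            self.sxy += x * y
            self.sxx += x * x

    def coefficients(self):
        """Return (slope, intercept) of y ~ a*x + b so far."""
        denom = self.n * self.sxx - self.sx ** 2
        a = (self.n * self.sxy - self.sx * self.sy) / denom
        b = (self.sy - a * self.sx) / self.n
        return a, b

model = IncrementalLinearModel()
model.update([-0.3, 0.1, 0.4], [-0.4, 0.0, 0.3])     # first block
a1, b1 = model.coefficients()
model.update([-0.1, 0.2, 0.5], [-0.25, 0.1, 0.45])   # later block
a2, b2 = model.coefficients()
```

Because only the running sums are stored, refitting after each new block costs almost nothing, which mirrors the idea of incrementally improving the model as the improved blocks themselves generate new data.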
Pre- to post-route correlation needs to be more than just accurate. Accurately predicting the post-route timing is only meaningful if it also delivers improvements; the predicted timing must guide the flow toward a measurably better result.
To quantify that, in EDA and IC design there are several key parameters that engineers need to focus on, invariably total negative slack (TNS), worst negative slack (WNS), and power. Together with area, these are commonly summarized as power, performance, and area (PPA). The positive impact that ML has on PPA results in a faster turnaround time, or TAT.
Negative slack is effectively an indication that a signal with timing constraints is reaching its destination too late (indicated by a negative number). The WNS represents the single worst timing endpoint in a network, while TNS is the combination of all negative slack figures for a given network.
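Those two metrics are straightforward to compute from a list of per-endpoint slack values, as this small Python sketch shows (the slack figures are hypothetical):

```python
# Minimal sketch: compute worst negative slack (WNS) and total
# negative slack (TNS) from per-endpoint slack values in ns.
# The slack figures below are hypothetical illustrations.

def wns_tns(endpoint_slacks):
    """WNS is the single worst slack; TNS sums only the violations."""
    wns = min(endpoint_slacks)
    tns = sum(s for s in endpoint_slacks if s < 0)
    return wns, tns

slacks = [0.12, -0.05, 0.30, -0.22, -0.01]
wns, tns = wns_tns(slacks)
# wns is -0.22; tns is approximately -0.28
```

Note that TNS aggregates every violating endpoint, so a design can improve its TNS substantially while WNS barely moves, which is why both figures are tracked.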
Figure 4 shows how using the ML generated timing model at the beginning of the P&R flow can enable improved final timing prediction, resulting in better design performance.
One of the biggest challenges all designers face is feature creep: adding new functionality late in the design process. This is a fact of life, and one that is very much a result of companies operating in a competitive landscape. Often, feature creep means that much of the design effort is lost, and teams almost certainly will not be able to benefit from the time and effort that went into the initial design.
ML changes that, because all of the design effort put into optimizing each block now has a residual value that can be realized by training the model, which in turn delivers faster and more accurate pre-routing data and post-routing predictions. Figure 5 shows some typical 7nm design performance improvements using a trained ML timing prediction model, compared with a traditional P&R flow. All key design metrics show ML benefits.
Without some way of directly benefiting from the entire design process, the time and cost of iterative design can quickly derail a project—even without introducing feature creep. But with ML in the flow, every design decision has greater value.
As mentioned earlier, ML and AI can be applied in more than one way to EDA. While ML inside is about improving the results from one part of the flow, ML outside is about accelerating the entire design flow. This is where the experience of the design team has the greatest impact because this is, historically, the part of the design flow that requires the highest amount of manual interaction. For this reason, putting that knowledge into an expert system has the potential to significantly accelerate the entire flow.
The use of DL analytics has the potential to provide design assistance here, by doing what an experienced engineer would perhaps do instinctively. During a typical design cycle, the block implementation is done many times as the RTL is refined to add new features and resolve any verification issues. Each iteration of the block implementation generates a vast amount of data, which when analyzed quickly, can help improve the results from the next iteration.
Today this is done manually by experienced engineers, but manual intervention is neither practical nor cost-effective. The human brain cannot take in all the information created during a single iteration in a timely manner and find the areas of optimization or violation, let alone do so across multiple iterations of a design cycle. However, data analytics is an ideal way to process enormous quantities of design data and find patterns that will enable the design flow to be improved during each iteration.
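As an illustration of the kind of pattern-finding involved, the sketch below aggregates per-iteration violation reports to surface design regions that remain hotspots across iterations. The region names and counts are invented for illustration; real iteration data would be far larger and richer.

```python
# Minimal sketch: mine per-iteration violation reports to find the
# design regions that recur as timing hotspots across iterations.
# Region names and counts are hypothetical illustrations.
from collections import Counter

# Each iteration's report: list of (region, violation_count)
iterations = [
    [("cpu_core", 42), ("mem_ctrl", 7), ("io_ring", 3)],
    [("cpu_core", 35), ("mem_ctrl", 12)],
    [("cpu_core", 28), ("io_ring", 5), ("mem_ctrl", 9)],
]

totals = Counter()
appearances = Counter()
for report in iterations:
    for region, count in report:
        totals[region] += count
        appearances[region] += 1

# Regions violating in every iteration are persistent hotspots,
# ranked by total violation count so effort goes where it pays most
persistent = [r for r, n in appearances.items() if n == len(iterations)]
ranked = sorted(persistent, key=lambda r: -totals[r])
```

A persistent hotspot like this is exactly the kind of signal an experienced engineer spots instinctively; automating the aggregation lets the flow act on it in every iteration.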
ML outside has great potential to provide guidance and automation for the whole design flow, increasing designer productivity as block complexity grows.
ML is inherently applicable to EDA design flows. Its value is, in part, thanks to its ability to operate without being explicitly programmed. For EDA, the value goes much further. Models that can be trained on a single design block can be used almost immediately to improve the post-route accuracy of pre-route data. This, in turn, leads to significant improvements in TNS, WNS, and PPA in general.
IC design has always been a compute- and data-intensive activity, and thanks to ML that data can now be fed back into the flow in a way that accelerates the design process by delivering better results faster and ultimately reducing the turnaround time for chip design.
Rod Metcalfe is a member of the product management team at Cadence and is responsible for digital implementation tools.