Getting the right balance for ADAS architectures

Article by: Gil Abraham

Three different classes of ADAS system processing are needed—edge, zonal and central—with three different profiles.

Recent analyst reports project that processing for ADAS and infotainment systems will expand significantly over the next five years. The analysts see advances on multiple fronts, not only in AI but also in general compute, and shifts in how OEMs want to structure electronic content, from edge-based to zonal to central management. A key consideration for any systems builder hoping to benefit from this growth is how to address these diverse automotive architecture needs through unified product families.

Yole Développement reports up to 3X growth over the next five years through direct innovation in active safety capabilities and in the digital cockpit. In addition, advances in adjacent technologies and regulation are driving rapid developments in driver and occupant monitoring systems. One question is where this growth is happening: around sensors at the edge, around consolidated processing in zones, or around central processing in the car. Innovation is still prominent at the edge, where new entrants can introduce competitive solutions against slower-moving centralized systems. Conversely, cost, safety and central software control push for more centralization.

Multiple edge processing nodes still need overall central control for safety

Before ADAS systems appeared, the rapid growth of electronics in cars had driven automotive OEMs to rethink how they wanted to distribute those electronics. Edge sensing has now accelerated that demand. The problem in part was the cost and management of data communication, amplified further by smart sensing – heavy wiring consumes a lot of power to transport data from the edge to consolidated processing.

However, sensor fusion must combine data from multiple sensor perspectives and types, which often doesn’t fit well either at the edge or centrally. We need edge AI for fast recognition and data reduction, but communication and fusion now push some AI to zonal processors. Meanwhile, as we move to smarter cars with some level of autonomy, those processors must consolidate distributed inputs under a driving policy manager. This kind of AI cannot be distributed. It must be handled in a central controller, for safety and for a consolidated perspective.

Sensor fusion must combine data from multiple sensor perspectives and types, which often doesn’t fit well either at the edge or centrally. We need edge AI for fast recognition and data reduction, but communication and fusion now push some AI to zonal processors. (Image: CEVA)

Clearly, three different classes of ADAS system processing are needed – edge, zonal and central – with three different profiles. Edge AI must continue to be fast and low cost (since there will be many of these around the car), using a single processor delivering up to 5 TOPS. Zonal processors, consolidating input from multiple edge devices, must offer a higher level of parallelism and performance, requiring a higher-premium multicore implementation running at up to 20 TOPS. Finally, the central driving policy engine must run inferencing against scenario-trained behaviors and may also need to allow for some level of on-the-fly training. This engine will very likely be a high-premium multi-chiplet device, with each chiplet a multicore, supporting up to 200 TOPS or more.
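To make the three profiles concrete, here is a minimal Python sketch that records the figures quoted above and totals the peak compute for one vehicle. The `TierProfile` type, the `aggregate_peak_tops` helper and, in particular, the node counts in `example_vehicle` are illustrative assumptions, not figures from any OEM or analyst.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierProfile:
    name: str
    peak_tops: int        # "up to" figure quoted in the text
    implementation: str   # implementation style described in the text


# Figures taken from the text; the text gives central as "up to 200 TOPS or more".
TIERS = {
    "edge": TierProfile("edge", 5, "single processor"),
    "zonal": TierProfile("zonal", 20, "multicore"),
    "central": TierProfile("central", 200, "multi-chiplet, each chiplet multicore"),
}


def aggregate_peak_tops(node_counts: dict[str, int]) -> int:
    """Sum peak TOPS over every node in a hypothetical vehicle configuration."""
    return sum(TIERS[tier].peak_tops * count for tier, count in node_counts.items())


# Hypothetical vehicle: these node counts are assumptions for illustration only.
example_vehicle = {"edge": 8, "zonal": 4, "central": 1}
total = aggregate_peak_tops(example_vehicle)
print(f"Aggregate peak compute: {total} TOPS")
```

Even with many more edge nodes than zonal nodes in this assumed configuration, the single central device accounts for most of the total, which is one way to see why the three tiers call for such different implementations.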

So how should an SoC developer architect a scalable ADAS system?

It is difficult to know yet how revenue opportunities will divide among many low-cost edge devices, fewer but higher-premium zonal devices, and perhaps only one high-premium central device per car. The smart money seems to be on preparing for opportunities in each segment. Given that, how should SoC product developers architect their solutions?

Training, optimization and infrastructure software represent some of the biggest investments in deploying an ADAS system. Supporting these consistently across a product family then becomes essential to economic success. An edge solution might be tuned for a lighter-weight objective than a zonal or central solution, but it should allow for a dialed-down version of the same core capabilities. This enables a common trained network to be compiled, with different compiler options, and deployed for inference on edge, zonal and central solutions.

Correspondingly, the AI hardware platform should allow for scale-up/down. It should have the same architecture, deployable as a single neural engine or multiple parallel engines, with uniform data traffic control and memory hierarchy optimization, allowing even scale-out to multi-chiplet implementations where needed.
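As a rough illustration of scaling one architecture up and down, the sketch below estimates how many identical neural engines, and how many chiplets, each tier's target would need. It assumes perfectly linear scaling, a per-engine peak of 5 TOPS (borrowed from the edge figure) and a hypothetical limit of 8 engines per chiplet. None of these parameters comes from a real product, and real scaling is rarely linear once memory bandwidth and data traffic are taken into account.

```python
import math


def plan_engines(target_tops, tops_per_engine=5.0, max_engines_per_chiplet=8):
    """Return (engines, chiplets) needed to reach target_tops.

    Assumes linear scaling across identical engines; both default
    parameters are hypothetical values chosen for illustration.
    """
    engines = math.ceil(target_tops / tops_per_engine)
    chiplets = math.ceil(engines / max_engines_per_chiplet)
    return engines, chiplets


# Tier targets use the "up to" TOPS figures quoted in the article.
tier_targets = {"edge": 5, "zonal": 20, "central": 200}
plans = {tier: plan_engines(tops) for tier, tops in tier_targets.items()}

for tier, (engines, chiplets) in plans.items():
    print(f"{tier}: {engines} engine(s) on {chiplets} chiplet(s)")
```

Under these assumptions the edge target fits a single engine, the zonal target a multicore on one die, and only the central target spills over into several chiplets.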

But here’s the real trick. Network developers should be able to use all state-of-the-art AI methods to meet their objective, including better performance with less power, without compromising the ability to scale. Several such methods are available.

Take Winograd transforms, for example, a popular option in advanced inferencing, which offer 2X performance at reduced power with little to no precision degradation at greatly reduced word widths. Support is also often available for a wide range of activation and weight data types in fully mixed-precision neural MAC arrays. Tuning layer precision can significantly reduce memory requirements and power. Sparsity engines take this a step further, eliminating the need to multiply by zero values, which become even more common in low-precision layers. This increases performance while also reducing power.
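For readers unfamiliar with the Winograd transforms mentioned above, the following sketch shows the smallest standard case, F(2,3): two outputs of a 3-tap filter computed with 4 multiplications instead of 6. This is the textbook 1D algorithm written in plain Python for clarity, not a description of any particular accelerator. The function names are illustrative, and the 2X figure quoted in the text is not derived from this example. Nesting the same idea in 2D, F(2x2,3x3), needs 16 multiplications where the direct method needs 36.

```python
def direct_f23(d, g):
    """Direct method: two outputs of a 3-tap filter, 6 multiplications."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def transform_filter(g):
    """Filter transform: done once per filter, so it can be precomputed offline."""
    return [g[0], (g[0] + g[1] + g[2]) / 2, (g[0] - g[1] + g[2]) / 2, g[2]]


def winograd_f23(d, u):
    """Winograd F(2,3): same two outputs using only 4 multiplications."""
    # Input transform uses additions and subtractions only.
    v = [d[0] - d[2], d[1] + d[2], d[2] - d[1], d[1] - d[3]]
    # Element-wise product: the only multiplications at inference time.
    m = [vi * ui for vi, ui in zip(v, u)]
    # Output transform, again additions and subtractions only.
    return [m[0] + m[1] + m[2], m[1] - m[2] - m[3]]


inputs = [1.0, 2.0, 3.0, 4.0]
taps = [1.0, 2.0, 3.0]

y_direct = direct_f23(inputs, taps)
y_winograd = winograd_f23(inputs, transform_filter(taps))
print(y_direct, y_winograd)
```

The divisions by 2 in the filter transform hint at why precision needs care at reduced word widths: the transformed filter values may not be exactly representable in the original integer format.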

Custom operations – a must-have in state-of-the-art accelerators – can be added in inferencing through external accelerators. Performing the calculations in an embedded vector processing unit at the same level as the native engines is also possible. There are more features that next-generation network architectures can exploit, like fully connected layers, RNNs, transformers, 3D convolution and matrix decomposition.

A modern AI processor can provide all these capabilities in a central engine without compromising on flexibility to meet future needs. Deploying such a solution makes software and network development scalable from that engine to the zonal engine and to an edge engine. At the same time, the same scalable hardware platform can run the same trained networks, mapped and suitably scaled to each target objective, so that design engineers can maximize AI performance through unified product families.

This article was originally published on Embedded.

Gil Abraham is business development director for CEVA’s vision business unit.

