How MCU memory dictates zone and domain ECU architectures

Article By : Sachin Gupta

Automakers around the globe are transitioning from traditional distributed ECU architectures to domain- or zone-based ECU architectures.

Demand for user comfort, safety, and driver assistance continues to increase the number of electronic control units (ECUs) in vehicles. However, this ongoing expansion of ECUs brings more challenges for automakers. As a result, most automakers around the globe are transitioning from traditional distributed ECU architectures to domain- or zone-based ECU architectures.

Domain-based architectures aim to integrate the high-level control for a complete domain (Figure 1-a). Especially in hybrid and electric vehicles (HEVs and EVs), where all functions tightly interact, distributed architectures struggle to manage complexity and the real-time aspects of the system. For example, braking in an EV is not only about stopping the vehicle but also about capturing back electromotive force (EMF) to recharge the battery.

Figure 1-a A domain-based architecture aims to integrate a higher level of control. Source: STMicroelectronics

Figure 1-b A zone-based architecture consolidates multiple ECUs onto a single MCU. Source: STMicroelectronics

Zone-based architectures consolidate multiple ECUs from several domains onto a single MCU and reduce the number of wiring harnesses throughout the vehicle (Figure 1-b). Two major factors push OEMs to reduce the number of harnesses in their vehicles: each additional harness adds weight and complexity. Weight is key for EVs in particular, as every extra kilogram reduces the distance the vehicle can travel on a single charge. Zone-based architectures offer major advantages, especially in the body domain, in eliminating some of these harnesses. But it isn't all or nothing: a vehicle can use different architectures for different domains to get the best of both domain- and zone-based approaches.

Both zone- and domain-based architectures support the separation of hardware and software lifecycles, letting manufacturers update and upgrade vehicle software without changes to the hardware components. These new architectures also enable software-defined vehicles, which can roll out new capabilities in minimal time.

Change in memory requirements for transition

For a start, domain and zone architectures require MCUs that offer much higher compute capability compared to MCUs used in traditional distributed architectures. Domain architectures today need multicore real-time MCUs running at clock speeds as high as 400 MHz. In fact, some MCUs for these architectures have up to six Arm Cortex-R52 cores with up to four cores running in lock-step configuration to perform real-time error checking. These behemoths can have 10 Arm cores in total.

Although MCU cores and operating frequency are the specifications most commonly referenced by system architects, embedded/on-board non-volatile memory (NVM) capability also has a significant impact on overall system performance and cost. In spite of this, memory specs are some of the most overlooked. For example, two MCUs with the same cores and operating frequency can differ dramatically in compute performance, power, and reliability depending on the type and speed of memory each uses. Memory type and speed also determine the MCU's in-field firmware upgradability and the cost of achieving it.

Embedded non-volatile memory limitations for new architectures

Commonly, in compute systems, non-volatile memory is used to store code and data. Most general-purpose MCUs use embedded flash for this purpose, and this embedded flash is generally a floating-gate or some type of charge-trap NOR flash. Most of these embedded NVMs are very slow, supporting maximum read frequencies below 20 MHz.

For a 25-MHz NVM in a 400-MHz system, the memory requires approximately 15 wait states. So, even though the CPU is running at 400 MHz, it takes 15 cycles to get an instruction from the memory before the CPU can execute it. MCUs use cache to minimize these wait states, although cache is only as good as the program’s hit rate. As a result, frequent cache misses have a significant impact on the overall compute performance due to these flash-wait states.
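The interplay of cache hit rate and flash wait states described above can be sketched with some back-of-envelope arithmetic. The 400-MHz CPU and 15 wait states come from the article; the hit rates below are illustrative assumptions, not measured figures for any particular MCU.

```python
# Model of how flash wait states degrade effective instruction-fetch rate.
# Assumes: 400 MHz CPU, single-cycle fetch on a cache hit, and 15 extra
# wait-state cycles on a cache miss (25 MHz flash, per the article).
# Hit rates are illustrative assumptions only.

CPU_HZ = 400e6
WAIT_STATES = 15  # extra cycles per flash access: 400 MHz / 25 MHz - 1

def avg_fetch_cycles(hit_rate: float) -> float:
    """Average cycles per fetch: 1 on a hit, 1 + wait states on a miss."""
    return hit_rate * 1 + (1 - hit_rate) * (1 + WAIT_STATES)

for hit_rate in (0.99, 0.95, 0.80):
    cycles = avg_fetch_cycles(hit_rate)
    effective_mhz = CPU_HZ / cycles / 1e6
    print(f"hit rate {hit_rate:.0%}: {cycles:.2f} cycles/fetch, "
          f"~{effective_mhz:.0f} MHz effective fetch rate")
```

Even a 95% hit rate nearly halves the effective fetch rate in this model, which is why a fast NVM matters despite the presence of cache.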

Over time, innovations have improved embedded NOR flash speed. Unfortunately, flash technologies struggle to scale to smaller technology nodes. While most are qualified at 40 nm, only a few have been qualified at 28 nm, and with a notable cost increase due to the difficulty of integrating these memory cells into a very complex high-k metal-gate front-end technology.

Because they are among the more recently designed controllers, most zone-based MCUs are built at the 28-nm node to maximize integration and to allow the larger-capacity memories needed for very large applications, which can be 20 MB or larger in zone and domain architectures. Making matters worse, the over-the-air (OTA) firmware upgrades these architectures enable (more on OTA firmware upgrades in the second part of this article series) require these MCUs to offer at least 40 MB of embedded NVM.
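The jump from 20 MB of application code to 40 MB of NVM is consistent with a dual-bank (A/B) update layout, in which the running image stays in one bank while the new image downloads into the other. The 20-MB application size is from the article; the A/B layout is a common scheme offered here as an assumption, not a claim about any specific MCU.

```python
# Sizing sketch for a dual-bank (A/B) OTA layout: the NVM must hold two
# full copies of the application image (active bank + staging bank).
# The 20 MB image size is from the article; the A/B scheme is an assumed,
# commonly used layout, not a vendor-specific figure.

APP_IMAGE_MB = 20

def min_nvm_for_ab_update(app_mb: int) -> int:
    """Minimum embedded NVM to keep the old image running while staging
    the new one."""
    return 2 * app_mb

print(f"minimum NVM for A/B OTA of a {APP_IMAGE_MB} MB image: "
      f"{min_nvm_for_ab_update(APP_IMAGE_MB)} MB")
```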

That’s why this memory capacity may not be practical at 28 nm for most embedded flash technologies available today. Also, some other scalable embedded NVM technologies cannot be qualified at the high temperatures needed in automotive applications. As a result, some zone MCUs either do not have an embedded NVM or are sold as a dual-die system-in-package (SIP). These MCUs typically have a large RAM and execute code from the RAM. While this solution offers slightly better compute performance than embedded flash, there are several disadvantages for zone- and domain-based applications.

The first disadvantage is the long start-up time needed for the MCU to load the RAM’s contents on start-up. Although it’s OK for the infotainment system to take a little time to come up when the vehicle starts, extended start-up time is a major issue for domain and zone architectures, which manage door control, steering control, lighting, and other critical functions; users expect these to be available immediately. Another disadvantage of RAM is that it consumes more power than NVM.

Moreover, during low-power modes, retaining the entire RAM requires this power-hungry memory to be constantly powered. When data in the RAM isn’t needed and it can be powered down, reloading data while transitioning from low-power mode to active mode comes at the cost of a longer transition time that may not be acceptable in some applications. Piling on, the power budget to reload the RAM is significant and may defeat the purpose of low-power mode if the application transitions into active mode frequently.
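The retention-versus-reload trade-off above amounts to a break-even calculation: powering the RAM down only saves energy if the sleep interval is long enough to amortize the reload. Every number below (retention power, reload energy) is an illustrative assumption for the sake of the arithmetic, not a figure from any datasheet.

```python
# Break-even sketch for the low-power RAM trade-off: retain the RAM
# (constant drain) vs. power it down and reload from external NVM on
# wake-up (fixed energy cost per wake). All figures are assumed,
# illustrative values only.

RETENTION_POWER_W = 0.010  # assumed drain to retain a large RAM in sleep
RELOAD_ENERGY_J = 0.050    # assumed energy to re-copy the image on wake-up

def break_even_sleep_s() -> float:
    """Sleep duration above which powering the RAM down saves energy;
    below it, the reload cost dominates and retention is cheaper."""
    return RELOAD_ENERGY_J / RETENTION_POWER_W

print(f"power-down only pays off for sleeps longer than "
      f"~{break_even_sleep_s():.0f} s (with these assumed figures)")
```

With these assumed figures, an application that wakes more often than once every few seconds would spend more energy reloading the RAM than it ever saved, which is the "defeats the purpose" scenario the article describes.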

Yet another consideration is system cost. RAM is a relatively real-estate-hungry IP, so putting a large RAM into an MCU to run application code is more expensive than embedded NVM. And whether the external NVM is integrated into the package as a SIP or mounted on the board, it adds cost either way. Other disadvantages include system and supply-chain reliability.

In the system, RAM has a higher bit-flip rate—typically due to radiation and generally known as soft error rate (SER)—compared to an NVM. That impacts the system’s reliability. To support the highest levels of reliability, the latest MCUs for automotive applications support end-to-end error correction code (ECC). External NVMs don’t support end-to-end ECC, which results in lower reliability and requires additional mitigation techniques for safety-critical ECUs.

All of this applies to both program and data memories, with one exception: in contrast to program memory, data memory requires much higher endurance. This higher endurance requirement brings its own challenges. For example, in a floating-gate NOR cell, a tunnel oxide separates the floating gate from the channel (Figure 2).

Figure 2 A tunnel oxide separates the floating gate from the channel in a floating-gate NOR cell. Source: STMicroelectronics

With every write and erase cycle, this oxide degrades and leakage increases, limiting the number of write cycles before the flash becomes unusable and making it unsuitable as a data memory. Scaling the technology to smaller nodes exacerbates the problem, while not scaling the tunnel-oxide thickness has its own side effects: large memory blocks embedded in smaller technology nodes still take longer to read, write, and erase.

Embedded flash also takes time to write to, in part because of the requirement of an erase operation to precede the write operation. All these factors adversely affect system performance and are especially painful when the CPU can run at high frequencies and has to wait for the memories.

PCM’s merits in zone and domain architectures

As has happened in the past, innovation and new technologies jump into the breach. Embedded phase change memory (ePCM), which is available in Stellar SR6 devices, addresses the performance requirements of zone and domain MCUs. Figure 3 shows a cross-sectional view of an ePCM cell in fully depleted silicon on insulator (FD-SOI) technology.

Figure 3 A cross-sectional view of a phase change memory (PCM) cell. Source: STMicroelectronics

A key point affecting the current generation of zone MCUs, as well as the entire technology and cost roadmap, is that integrating the ePCM storage element is much less expensive than the double-poly flash cell used by fast 28-nm embedded flash technologies in automotive applications. Furthermore, ePCM integration does not interfere at all with the complex high-k metal-gate transistor structure.

Last, but not least, unlike embedded flash, the write operation in ePCM does not require high voltage. Therefore, ePCM can work with the standard transistors, while flash requires dedicated high-voltage transistors to manage write voltages that can be 10 V or more. All of these factors impact manufacturability and cost.

Unlike NOR or NAND flash, PCM works based on the change in resistivity of a germanium antimony tellurium (GST) alloy. The alloy's resistivity changes with rapid temperature change, and that resistivity determines the bit's state. Figure 4 shows how a bit is set or reset in PCM.

Figure 4 The PCM write process shows how a bit is set or reset. Source: STMicroelectronics

So, ePCM offers fast read access and dramatically less write time compared to embedded NOR flash. The dramatic reduction in write time is because ePCM doesn’t need an erase operation before it can be written. This capability also drastically reduces factory programming time in large memory zone and domain MCUs and that brings down manufacturing cost.

Also, ePCM offers reliability and endurance benefits comparable to embedded flash. At the same time, ePCM allows single-bit alterability that mimics a true EEPROM. This significantly reduces system write time. Moreover, because it operates only on a target bit, single-bit writes don’t impact the life of neighboring memory cells. Therefore, even with endurance levels comparable to embedded flash, PCM effectively allows a greater number of writes for emulated EEPROM in the data NVM.
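The endurance advantage for emulated EEPROM can be made concrete with a simplified model: in sector-erase flash, updating any record forces an erase that ages every cell in the sector, whereas a bit-alterable memory touches only the target record. The endurance, sector-size, and record-size figures below are assumptions chosen for the arithmetic, not specifications of any device, and real EEPROM-emulation schemes journal writes to amortize erases.

```python
# Simplified comparison of total record updates a same-size region can
# absorb: sector-erase flash (every update erases the whole sector) vs. a
# bit-alterable memory such as PCM (each record wears independently).
# All figures are illustrative assumptions.

CELL_ENDURANCE = 100_000  # assumed program/erase cycles per cell, both memories
SECTOR_BYTES = 4096       # assumed flash erase-sector size
RECORD_BYTES = 16         # assumed size of one emulated-EEPROM record

def flash_total_updates() -> int:
    """Flash: each record update erases the whole sector, so every cell
    ages on every update; the sector wears out after CELL_ENDURANCE
    updates in total, regardless of which records changed."""
    return CELL_ENDURANCE

def pcm_total_updates() -> int:
    """Bit-alterable memory: a record is rewritten in place without
    disturbing neighbors, so each record gets its own endurance budget."""
    records = SECTOR_BYTES // RECORD_BYTES
    return CELL_ENDURANCE * records

print(f"flash sector budget: {flash_total_updates():,} record updates")
print(f"same-size bit-alterable region: {pcm_total_updates():,} record updates")
```

Even under equal per-cell endurance, the bit-alterable region in this sketch sustains orders of magnitude more record updates, which is the effect the article describes.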

Editor’s Note: The second article of this series about automotive designs will outline the role of NVM in over-the-air (OTA) firmware upgrades, old technology limitations, and how ePCM addresses them.

This article was originally published on EDN.

Sachin Gupta is product manager for automotive and IoT designs at STMicroelectronics.
