Building security into AI SoCs at silicon level

Article By : Marco Ciaffi and John Min


With the rapid deployment of artificial intelligence (AI), the focus of AI system on chip (SoC) design has been on building smarter, faster and cheaper devices rather than safer, more trusted, and more secure ones. The issue has been highlighted by one senior figure in national security policy, Jason Matheny, who has said, “AI systems are not being developed with a focus on the evolving threat landscape, despite ample funding from both the government and private sectors. In fact, less than one percent of funding is going toward security.” Matheny is founding director of the Center for Security and Emerging Technology at Georgetown University, and a commissioner on the National Security Commission on Artificial Intelligence.

Before we look at how to build security into AI SoCs at silicon level, consider what an AI system is. It comprises three elements:

  1. an inference engine that processes data, makes decisions, and sends commands;
  2. training data and a set of weights created during the machine learning phase;
  3. the physical device that carries out the commands.

For example, a Nest thermostat can set a user’s preferred temperature by analyzing and learning from the user’s behavior. Eventually, it can predict that the user likes to set the temperature 10 degrees cooler at night, and the inference engine will then send a command to the thermostat to lower the temperature at the same time every day.

Security threats in an AI system

The biggest threats to an AI system fall into two categories: control-oriented and data-oriented attacks. A control-oriented attack happens when an attacker exploits a common software vulnerability, like a buffer overflow, and takes over the system. As the name suggests, a control-oriented attack takes control of the AI device to carry out the attacker’s commands, whether that means exporting private data or something more ominous, such as forcing an autonomous vehicle to crash into the guardrail on the highway.
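A toy model can make the buffer overflow concrete. The sketch below is only an illustration, not code from any real device: it assumes a hypothetical stack frame made of an 8-byte input buffer followed directly by a 4-byte saved return address, and the names `make_frame`, `unchecked_copy` and `checked_copy` are invented for this example.

```python
BUF_SIZE = 8   # bytes reserved for the input buffer (assumed layout)
ADDR_SIZE = 4  # bytes holding the saved return address right after it


def make_frame(return_addr: int) -> bytearray:
    """Build a toy stack frame: [input buffer | saved return address]."""
    frame = bytearray(BUF_SIZE + ADDR_SIZE)
    frame[BUF_SIZE:] = return_addr.to_bytes(ADDR_SIZE, "little")
    return frame


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Copy input with no length check, so long input spills past the buffer."""
    for i, byte in enumerate(data):
        frame[i] = byte


def checked_copy(frame: bytearray, data: bytes) -> None:
    """Same copy, but reject input that does not fit in the buffer."""
    if len(data) > BUF_SIZE:
        raise ValueError("input longer than buffer")
    unchecked_copy(frame, data)


def saved_return_address(frame: bytearray) -> int:
    """Read back the address the function would return to."""
    return int.from_bytes(frame[BUF_SIZE:], "little")
```

Feeding `unchecked_copy` eight filler bytes followed by four attacker-chosen bytes replaces the saved return address, which is how control of the program is redirected; `checked_copy` refuses the same input.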

In a data-oriented attack, an attacker manipulates either the training data of an AI system or the real-world data that the system uses to make decisions, causing the AI device to malfunction and do something that it shouldn’t, based on false data. That could range from something as harmless as subverting a spam filter to something as malicious as causing a self-driving vehicle to not recognize a stop sign.
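The spam filter case can be sketched with a deliberately simple classifier. Everything here is an assumption made for illustration: messages are reduced to a single numeric "spamminess" score, and the hypothetical `train_threshold` learns a cut-off as the midpoint between the average spam score and the average legitimate score.

```python
def train_threshold(samples):
    """Learn a cut-off from (score, is_spam) pairs: midpoint of the class means."""
    spam = [score for score, label in samples if label]
    ham = [score for score, label in samples if not label]
    return (sum(spam) / len(spam) + sum(ham) / len(ham)) / 2


def is_spam(score, threshold):
    """Classify a message by comparing its score to the learned cut-off."""
    return score > threshold


# Clean training set: legitimate mail scores low, spam scores high.
clean = [(1, False), (2, False), (3, False), (8, True), (9, True), (10, True)]

# Poisoned set: the attacker injects high-scoring messages labeled as legitimate.
poisoned = clean + [(9, False)] * 6

clean_threshold = train_threshold(clean)        # midpoint of 2 and 9
poisoned_threshold = train_threshold(poisoned)  # pulled upward by the bad labels
```

With the clean data, a message scoring 7 is flagged as spam; after poisoning, the threshold has moved high enough that the same message passes. The model code never changed, only the data it learned from.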

Using RISC-V’s inherent security capabilities

By combining the security capability built into a CPU architecture and then adding hardware and software layers around the CPU, SoC designers can move closer to a design that has security built into its basic fabric.

So, how can SoC designers take advantage of such a capability in their designs?  The first step is to select a CPU architecture that anticipates the common forms of security attacks – such as the latest generation RISC-V open-source instruction set architecture (ISA), which was architected with security in mind. Next, add oversight silicon intellectual property (IP) and software that looks for threats before they get to the CPU hardware.

The RISC-V architecture was developed in the era of AI and today’s security-conscious world. Because the architecture is open source, it is open to public scrutiny and engineering, so there should be no surprises.

In addition, RISC-V comes with operating modes that have varying privilege levels and access rights. The first is machine mode (M-mode), in which software has full access to machine resources. At boot time, after the root of trust is established, the machine enters user mode to run user programs. In user mode, CPU access is restricted through controlled assignment of interrupts and exceptions. Two additional RISC-V security architecture features are (1) physical memory protection (PMP) and (2) physical memory attributes (PMA). These allow designers to specify which applications can access memory and how they do so.
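The way privilege modes and PMP interact can be sketched as a small access check. This is a simplified model, not the hardware interface: real PMP entries are programmed through configuration and address registers with their own address-matching encodings, whereas the hypothetical `PmpEntry` below just stores a base, a size, a permission string and a lock flag. The priority and default rules follow the general PMP behavior: the first matching entry decides, machine mode is only constrained by locked entries, and user mode is denied when nothing matches.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    MACHINE = "M"
    USER = "U"


@dataclass(frozen=True)
class PmpEntry:
    base: int             # first address covered (simplified range encoding)
    size: int             # number of bytes covered
    perms: str            # any subset of "rwx"
    locked: bool = False  # locked entries also bind machine mode


def pmp_allows(entries, mode: Mode, addr: int, access: str) -> bool:
    """Return True if `mode` may perform `access` ('r', 'w' or 'x') at `addr`."""
    for entry in entries:  # lowest-numbered matching entry takes priority
        if entry.base <= addr < entry.base + entry.size:
            if mode is Mode.MACHINE and not entry.locked:
                return True  # unlocked entries do not restrict machine mode
            return access in entry.perms
    # No entry matched: machine mode succeeds, user mode is denied.
    return mode is Mode.MACHINE


# Example layout (addresses are arbitrary): locked boot code, then user data.
layout = [
    PmpEntry(0x8000_0000, 0x1000, "rx", locked=True),
    PmpEntry(0x8000_1000, 0x1000, "rw"),
]
```

In this layout a user program can read and write its data region but cannot execute from it, and even machine-mode software cannot overwrite the locked boot code.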

Figure 1. Built-in RISC-V security.

Commercial RISC-V IP vendors add further functionality that the designer can choose to implement. For example, Andes Technology adds a stack overflow protection mechanism. To implement this function, the designer determines the maximum size of the program stack that the application will require for normal operation, for example, no more than 15 entries. When the application is in the field and executing, if a stack overflow is detected, the CPU generates an exception. The exception may have resulted from a normal event or a malicious attack. In either event, the exception handling software can then determine the culprit.
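The behavior described above can be modeled in a few lines. This is a software sketch of the idea only, not the Andes hardware mechanism: `GuardedStack` and `StackOverflowTrap` are hypothetical names, and the limit of 15 entries is taken from the example in the text.

```python
class StackOverflowTrap(Exception):
    """Stands in for the exception the CPU would raise on overflow."""


class GuardedStack:
    """A stack that traps as soon as it would exceed the configured bound."""

    def __init__(self, max_entries: int):
        self.max_entries = max_entries
        self._items = []

    @property
    def depth(self) -> int:
        return len(self._items)

    def push(self, value) -> None:
        if self.depth >= self.max_entries:
            # The handler, not the stack, decides whether this was a bug or an attack.
            raise StackOverflowTrap(
                f"push would exceed the limit of {self.max_entries} entries"
            )
        self._items.append(value)

    def pop(self):
        return self._items.pop()


def fill(stack: GuardedStack, count: int) -> str:
    """Push `count` values and report whether the bound was hit."""
    try:
        for value in range(count):
            stack.push(value)
    except StackOverflowTrap:
        return "trapped"
    return "ok"
```

Calling `fill(GuardedStack(15), 15)` stays within the bound, while asking for a sixteenth entry triggers the trap and leaves the stack at its maximum depth for the handler to inspect.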

Another security add-on is the programmable memory attribute, which enables the designer to assign memory regions as read-only, write-only, or accessible without restrictions. Additionally, the designer can choose to hide a memory region that contains critical data for the application. If any attempt is made to access this region, an exception is generated so that software can evaluate which function was attempting to breach the region without authorization. A third Andes Technology addition is the ability to protect code from being disassembled and reverse engineered.

The RISC-V ISA enables even more security functionality through the ability to add custom extensions, for example, custom instructions that speed up the execution of crypto algorithms. These custom instructions can scramble and unscramble data, and anyone without a working knowledge of the instructions would have difficulty hacking them. Another security feature is the creation of private memory storage that is isolated from main system memory and accessible only by the designer’s application software.

Finally, creation of a private bus to access coprocessors, crypto processors, and private memories adds yet another level of security to the RISC-V ISA. This private access keeps the most important data away from the system bus and any application looking to make a covert intrusion. Andes Technology simplifies the task of adding custom extensions and security elements to the RISC-V ISA with its Andes Custom Extensions (ACE) tool. ACE considerably reduces the time needed to make and verify these additions.

Figure 2. Andes custom extensions.

Beyond the security features inherent in the RISC-V ISA and the additional features provided by Andes as described above, Dover Microsystems’ CoreGuard adds an oversight system around the RISC-V CPU. This acts as a bodyguard to the host RISC-V processor, monitoring every instruction executed and preventing the exploitation of software vulnerabilities. The CoreGuard solution comprises both hardware and software components. The hardware component is the oversight silicon IP that integrates with the host RISC-V core. Residing in hardware makes it unassailable over the network, and running at hardware speeds provides real-time enforcement.

The software portion of the solution has two parts. The first is a set of micropolicies, which define security, safety and privacy rules. The second is metadata: information about the software application protected by the micropolicies. In operation, the host RISC-V processor reads the instructions and data that need to be processed out of memory and sends the instruction trace to the oversight hardware. The hardware applies the active set of micropolicies, together with the metadata each micropolicy needs to make its decision. If the instruction does not violate a micropolicy, the instruction is executed. However, if a micropolicy has been violated, a violation is issued back to the host to be handled as an exception.
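The check-then-execute flow described above can be sketched as follows. This is not CoreGuard’s interface or policy language, which the article does not show; `Instruction`, the address-to-tag metadata map and the two example rules are all assumptions made to illustrate how rules plus metadata yield an allow-or-violate decision for each instruction.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Instruction:
    pc: int                     # address the instruction was fetched from
    op: str                     # "load", "store" or "alu"
    addr: Optional[int] = None  # data address for loads and stores

Metadata = Dict[int, str]  # address -> tag such as "code" or "data" (assumed form)
Policy = Callable[[Instruction, Metadata], bool]  # True means "allowed"


def no_write_to_code(instr: Instruction, meta: Metadata) -> bool:
    """Example rule: stores must not target memory tagged as code."""
    return not (instr.op == "store" and meta.get(instr.addr) == "code")


def no_execute_data(instr: Instruction, meta: Metadata) -> bool:
    """Example rule: instructions must not be fetched from memory tagged as data."""
    return meta.get(instr.pc) != "data"


def check(instr: Instruction, meta: Metadata,
          policies: Dict[str, Policy]) -> Optional[str]:
    """Return the name of the first violated policy, or None if all pass."""
    for name, policy in policies.items():
        if not policy(instr, meta):
            return name
    return None


ACTIVE_POLICIES = {
    "no_write_to_code": no_write_to_code,
    "no_execute_data": no_execute_data,
}
METADATA = {0x100: "code", 0x104: "code", 0x200: "data"}
```

An ordinary store to `0x200` passes every rule and would be executed; a store aimed at `0x100`, or an instruction fetched from `0x200`, returns the name of the violated rule, which corresponds to the violation reported back to the host.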

Figure 3. CoreGuard block diagram.

In addition to stopping the attack before any damage can be done, the security solution provides the host RISC-V processor with information about the precise malicious instruction that was attempting to do something it shouldn’t.

The host can then take a variety of actions based on that information. The default action is a segmentation fault, which terminates the application; however, this is not usually the production option. Other options include asking for user input, activating address space layout randomization (ASLR) to buy time, or having a separate, “safe” application take over.

For example, suppose an attacker attempts to exploit vulnerabilities in a package delivery drone’s navigation software to have all packages rerouted and delivered to one location. When the micropolicy violation is detected, a safe location could be taken out of protected storage and the drone instructed to fly home to that safe location.

To illustrate how this security solution thwarts control-oriented attacks, let’s consider an autonomous vehicle. Suppose an attacker exploits a buffer overflow vulnerability that the CPU’s hardware countermeasures did not detect and attempts to inject code to take full control of the vehicle. CoreGuard’s Heap micropolicy stops all buffer overflow attacks, including zero-day threats, shutting the door on the attacker’s entrance path.
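The article does not describe how the Heap micropolicy works internally. One way a rule of this kind could be expressed is memory coloring: each allocation and the pointer to it share a tag, and any access whose pointer color differs from the cell color is refused. The sketch below illustrates that idea under this assumption; `ColoredHeap` and `HeapViolation` are hypothetical names.

```python
class HeapViolation(Exception):
    """Raised when a pointer touches memory outside its own allocation."""


class ColoredHeap:
    """Toy heap in which every allocation gets a unique color (tag)."""

    def __init__(self, size: int):
        self.cells = [0] * size
        self.colors = [None] * size  # None marks unallocated memory
        self._next_color = 1
        self._next_free = 0

    def malloc(self, length: int):
        """Allocate `length` cells and return a (base, color) pointer."""
        base = self._next_free
        if base + length > len(self.cells):
            raise MemoryError("toy heap exhausted")
        color = self._next_color
        self._next_color += 1
        for addr in range(base, base + length):
            self.colors[addr] = color
        self._next_free += length
        return (base, color)

    def store(self, pointer, offset: int, value: int) -> None:
        """Write through a pointer; refuse if the target cell has another color."""
        base, color = pointer
        addr = base + offset
        if not 0 <= addr < len(self.cells) or self.colors[addr] != color:
            raise HeapViolation(f"pointer color {color} may not write address {addr}")
        self.cells[addr] = value
```

With two adjacent four-cell allocations, a write at offset 3 through the first pointer succeeds, while a write at offset 4 would land in the second allocation and is refused before memory changes. Because the rule checks the access itself rather than recognizing a known exploit, it does not depend on the vulnerability having been seen before.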

This type of attack was well documented in 2015, when researchers successfully took control of a Jeep vehicle. The researchers were able to control everything from audio volume to brakes and steering. In a CoreGuard-protected autonomous vehicle, the system could instead notify the driver to turn off autonomous mode and take control of the vehicle.

In a data-oriented attack, the attacker’s goal is to gain access to the application running the AI inference engine and modify its sensor data, causing the system to fail. A common AI application is predictive maintenance, which predicts when a machine or aircraft will need maintenance. It is used across industries to anticipate anything from mechanical issues on aircraft to the point at which an industrial refrigerator at a food production facility will need servicing.

This application relies on high-quality, accurate data readings. If an attacker is able to manipulate those data readings, the AI system will not have a clear or accurate picture with which to make predictions. For example, an aircraft requiring maintenance could go unnoticed, and a malfunction could occur mid-flight.

The data that powers AI systems is most vulnerable just before the signing and just after the signature verification process. If an attacker can intercept and alter that data, the AI system could be severely compromised.  CoreGuard can ensure data authenticity using a data integrity micropolicy. This micropolicy prevents the modification of data between digital signature authentication and the AI system, ensuring only trusted and secure data is fed into the system.
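The gap between signature verification and use can be sketched in software. This is an illustration of the rule, not CoreGuard’s implementation: an HMAC from Python’s standard library stands in for a real digital signature scheme, and `TaggedBuffer`, `IntegrityViolation` and the `verified` tag are hypothetical names.

```python
import hashlib
import hmac


class IntegrityViolation(Exception):
    """Raised when verified data is modified or unverified data is consumed."""


def sign(key: bytes, payload: bytes) -> bytes:
    """Stand-in for a digital signature: an HMAC-SHA256 over the payload."""
    return hmac.new(key, payload, hashlib.sha256).digest()


class TaggedBuffer:
    """Sensor data that carries a 'verified' tag once its signature checks out."""

    def __init__(self, payload: bytes):
        self._data = bytearray(payload)
        self.verified = False

    def verify(self, key: bytes, signature: bytes) -> bool:
        """Set the tag only if the signature matches the current contents."""
        self.verified = hmac.compare_digest(sign(key, bytes(self._data)), signature)
        return self.verified

    def write(self, index: int, value: int) -> None:
        """Writes are allowed before verification and refused afterwards."""
        if self.verified:
            raise IntegrityViolation("verified data may not be modified")
        self._data[index] = value

    def read_for_inference(self) -> bytes:
        """Only verified data may be handed to the AI system."""
        if not self.verified:
            raise IntegrityViolation("unverified data may not reach the model")
        return bytes(self._data)
```

Data altered before verification fails the signature check and never reaches the model, and data altered after verification is blocked by the tag, which closes the window described above.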

In Figure 4, we illustrate how this results in security being built into the design rather than being added as an afterthought. In this example, the CoreGuard solution is integrated with an Andes N25 RISC-V host processor. The integration involved identifying the signals needed to inform the CoreGuard interlocks about instruction execution: the load, store, and data address signals extracted from the data memory access pipeline stage. Next, the integrated SoC design was synthesized for hardware simulation. The result successfully ran a sample application that exercised all CoreGuard interlock paths, thus proving the feasibility of integrating the N25 RISC-V processor with CoreGuard.

Figure 4. Proof of integration feasibility (N25+CoreGuard).

There are many ways to secure a system on chip.  What we have illustrated in this article is how it’s possible to start securing the CPU by using the inherent features built into the RISC-V architecture and additional extensions. We then showed how adding an extra layer to chip level security can protect data flow and interactions between various devices in the SoC.  Security doesn’t have to be hard, but it does require planning.

 

This article was originally published on Embedded.

 

Marco Ciaffi Dover Microsystems

Marco Ciaffi is co-founder & VP of engineering, Dover Microsystems. Prior to Dover, Marco built and led the team incubating the technology at Draper Labs. He also worked as senior engineering manager at RSA Security where he was responsible for SecurID hardware authentication products. Marco received his bachelor of engineering from Pratt Institute and his masters in computer science from Boston University. He also holds an executive certificate in strategy and innovation from the Sloan School of Management at MIT.

 

John Min Andes Technology

John Min is director of field applications engineering for North America at Andes Technology. John has been working for processor companies in the Silicon Valley for past 30 years, including at Hewlett Packard, LG, Arc, MIPS and SiFive. He has a wealth of information on processor architectures, IP and high-performance processing. John specializes in balancing the power, area and performance to yield optimized SoC. He is a graduate of University of Southern California with degrees in electrical engineering and biomedical engineering.
