Why It’s So Hard To Secure AI Chips

Semiconductor Engineering

Much of the hardware is the same, but AI systems have unique vulnerabilities that require novel defense strategies.

Demand for high-performance chips designed specifically for AI applications is spiking, driven by massive interest in generative AI at the edge and in the data center, but the rapid growth in this sector also is raising concerns about the security of these devices and the data they process.

Generative AI — whether it’s OpenAI’s ChatGPT, Anthropic’s Claude, or xAI’s Grok — sifts through a mountain of data, some of which is private, and runs that data on highly valuable IP that companies in the field have a vital interest in keeping secret. That extends from the hardware, which can be corrupted or manipulated, to the huge volumes of data needed to train the models, which can be “poisoned” to achieve unexpected results. And while all of this may be an inconvenience in a smart phone, it can have devastating consequences in safety-critical applications such as automotive and industrial, or in mission-critical applications such as financial transactions.

Fig. 1: Taxonomy of attacks on predictive AI systems. Source: NIST

Given the rapid growth of this sector, it’s essential to address security concerns from the outset. Deloitte estimates the global market for generative AI chips will reach $50 billion in 2024, but the firm notes that sales could reach as high as $400 billion in just three years’ time. How long this run continues is anyone’s guess. Regional regulation of AI, geopolitics, and other non-chip-specific threats may have an impact. But at least for now AI appears to be unstoppable, and security experts are sounding alarms about possible threats at multiple levels.

From a chip standpoint, the good news is that AI chips are no different from other types of chips. But there are a lot of them, and they are often designed for maximum processing speed, low latency, and, in edge devices, low power.

“There’s a range of [AI] chips, but the big ones are very performance-based,” said Mike Borza, principal security technologist at Synopsys. “That means that anything we do to increase security or to provide confidentiality and integrity protection of the model and the memory around the AI chip needs to be at very high-performance levels, so it’s a technical challenge in that respect. That challenge comes with area impositions and power implications, as well. There’s some latency associated with it just because it takes time to do that work. That’s the kind of the framework that you operate in, but that’s the framework of every chip that you’re always concerned about.”

What makes a generative AI chip a particularly attractive target for cyberattacks is the amount of data it processes.

“There really is not a unique security concern for AI chips, but because of the additional compute power and lower power consumption profiles they are created under, they are more attractive to cryptocurrency or crypto-mining specific types of use cases that will be very processor-intensive,” said Jim Montgomery, principal solution architect at TXOne. “I can see an attraction there, but that’s not a physical attribute of the chip.”

Increasingly, however, these chips are targets for attacks, and defending them requires a multi-pronged security strategy. Only one part of that, according to Lee Harrison, director of automotive test solutions at Siemens EDA, “is what security is there to protect access to these AI capabilities, which doesn’t necessarily mean there’s anything protecting against the AI doing stupid stuff. That’s just protection against bad actors getting access to the AI technology, and both are as important as the other, because if you get access to the AI technology, somebody could make use of that AI technology for their own benefit. Or, you can actually be more malicious and get the AI technology to do things that it’s not supposed to. It depends on how much of a bad actor you are.”

Securing IP

AI components are also attractive for attackers because of the incredibly high value placed on the algorithms that power the models, said Scott Best, a senior principal engineer at Rambus.

“Why they need to protect it is not just for the usual reasons — that you don’t want malicious actors causing your chip to malfunction in some insecure way,” Best said. “You don’t want them to extract any secrets, you don’t want them to reverse engineer your firmware or any of your trade secrets that are in the chip. If your adversary can get the output of your training set, those 5 million 16-bit weights, they can install that in their own processor and immediately have your functionality. So you do have some very determined adversaries that are like, ‘Look, we’re not going to spend $3.5 million of our own on a 70,000 GPU compute farm. We’re going to just steal what Tesla did.’”

The chips also are slightly more vulnerable because they require incredibly low latency to function as effectively as possible.

“The big thing is that bandwidth to and from memory, and latency to and from memory, are two of the key performance metrics by which AI systems are judged, because it affects the rate at which you can detect and respond to whatever the stimulus is,” said Borza. “The minimum latency is a few clock cycles at about hundreds of megahertz to low gigahertz rates. That doesn’t sound like very much time, but when you’re doing a lot of memory accesses, if you have these delays on the way in and the way out, that starts to cost you a lot. Then, once you start structuring data into blocks of data, you get to be concerned about how much data you actually have to move in order to access a single byte of data or a single nibble of data.”
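Borza's point about block granularity can be made concrete with a little arithmetic. The sketch below is a toy illustration (the 64-byte block size is an assumption, chosen because it is a common cache-line size, not a figure from the article): when memory moves in fixed-size blocks, touching a single byte still costs a full block transfer, so scattered accesses amplify traffic dramatically compared with packed ones.

```python
# Toy illustration of read amplification: when memory is moved in fixed-size
# blocks, touching a single byte still costs a full block transfer.
# BLOCK_SIZE is an assumption (64 bytes, a common cache-line size).

BLOCK_SIZE = 64  # bytes moved per memory transaction (assumed)

def bytes_transferred(byte_addresses, block_size=BLOCK_SIZE):
    """Total bytes actually moved to service a set of single-byte reads."""
    blocks = {addr // block_size for addr in byte_addresses}
    return len(blocks) * block_size

# 1,000 single-byte reads scattered one per block:
# 64,000 bytes moved to deliver 1,000 useful bytes (64x amplification).
scattered = bytes_transferred(range(0, 64_000, 64))

# The same 1,000 reads packed contiguously touch only 16 blocks.
packed = bytes_transferred(range(1_000))

print(scattered, packed)  # 64000 1024
```

The same effect is why encryption and integrity checks that operate per block add latency on every one of those transfers, not just once per logical access.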

AI networks have another unique vulnerability. Because they require incredible amounts of data for training, those looking to mount an attack may be tempted to do so by slipping bad data into devices.

“It’s quite scary as to what you can actually achieve,” said Harrison. “A simple example is if you’re using an ADAS system in a vehicle, how do you give the AI system within that vehicle some kind of phantom information about things like speed limits on roads. So there’s a lot of research going on into how easy it is to feed that phantom data into these AI devices.”
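The data-poisoning risk Harrison describes can be sketched in miniature. The example below is a deliberately toy classifier (a nearest-mean model over one-dimensional features, invented for illustration and unrelated to any real ADAS system): a handful of mislabeled training points drags one class mean far enough to flip a prediction near the boundary, even though most of the training data is clean.

```python
# Toy sketch of training-data poisoning: a trivial nearest-mean classifier
# for two "speed sign" classes. A few mislabeled points shift one class
# mean enough to flip a borderline prediction. Illustrative only.

from statistics import mean

def train(samples):
    """samples: list of (feature, label); returns per-label feature means."""
    by_label = {}
    for x, y in samples:
        by_label.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_label.items()}

def predict(model, x):
    # Classify by the nearest class mean.
    return min(model, key=lambda y: abs(model[y] - x))

clean = [(i, "30kph") for i in range(10)] + [(i, "100kph") for i in range(20, 30)]
model = train(clean)
print(predict(model, 13))  # 30kph -- closer to the 30kph mean (4.5) than 24.5

# Poison: five points labeled "30kph" with a large feature value drag that
# class mean from 4.5 up to about 36.3, past the other class's mean.
poisoned = clean + [(100, "30kph")] * 5
model_p = train(poisoned)
print(predict(model_p, 13))  # 100kph -- the same input now misclassifies
```

Real poisoning attacks target far larger models, but the mechanism is the same: the attacker never touches the inference input, only the training set.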

While poisoning AI with bad data is widely acknowledged as a possible problem within engineering circles, TXOne’s Montgomery also warned of less obvious attack routes, including a number of security capabilities that may be overlooked. “One of them is firmware, and the utilization of open-source coding within the firmware, because that’s not validated or reviewed,” Montgomery said. “It’s open source. It’s the Wild West. So one area of concern I have is the utilization of open-source code when it comes to firmware on these chips, because you open up the possibility for exposure, the possibility for compromise on the firmware itself.”

Harrison agreed that the firmware could be a vulnerability, though he questioned whether it’s a hardware design issue. “A lot of the hardware is actually built to target that firmware, which is obscured,” he said. “For AI chip manufacturers’ products to be successful, they have to be able to work with that firmware. It’s like building a PC system that has to support Linux. If it doesn’t support Linux, you’re not going to sell that many of them. So to be able to adopt the industry standard AI firmware, you need to be able to do that. How you protect against that is more of a software question.”

Montgomery also voiced some concern over the integrity of supply chains for chip components. Although the CHIPS Act is aimed at bolstering American independence in semiconductor manufacturing, the niche nature of some of the components may make this a problem that can never be totally solved. “Based on my exposure to the industry and what’s happening in the industry on sourcing production for all types of microelectronics or integrated circuits, I’m not sure how possible that is, just because of the limited number of manufacturers or limited number of people that are actually producing this specific thing. There may be some limited availability, as far as who’s producing it, and then where you can actually source that from. Again, I’m not sure we’ll ever completely alleviate that.”

Partial solutions

Fortunately, much thought has been given to how to solve these security issues. Borza noted that several strategies already are being implemented in AI-focused data centers, where they are combined with strategies such as confidential computing and separation of compute spaces. Still, he expressed concerns that at least one of the new techniques may not go far enough to silo off data from a possible attack.

“There are things like inline memory encryption units now, which provide encryption and usually integrity protection, as well,” Borza said. “Encryption without integrity protection is generally considered a risk by security experts, but I understand that people do it because of the costs associated with integrity protection. The standard kind of scheme, in that case, puts the encryption unit closely coupled with the memory controllers. So you can designate regions of memory that are going to be encrypted, and that happens as the data flows through the system.”

The second option involves “lookaside” engines through which data is staged between the caches and main memory.

“The memory goes through the encryption unit, one block at a time or several blocks at a time, as it’s being copied into or out of the system bus. At that point, you’re potentially dealing with a few hundred bytes of data at a time,” Borza explained.
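The core idea behind both placements is the same: each block of data is encrypted, tweaked by its memory address, and carries an integrity tag so tampering is detected. The sketch below is a toy model of that concept only. A real inline memory encryption unit would use hardware AES (for example AES-XTS or AES-GCM), not the SHA-256 keystream used here for self-containedness, and the key and addresses are invented.

```python
# Toy sketch of per-block memory encryption with integrity protection,
# in the spirit of an inline memory encryption unit. Illustrative only:
# real hardware uses AES, not a hash-based keystream.

import hashlib
import hmac

KEY = b"demo-key-not-for-production"

def _keystream(addr, length):
    # Address-tweaked keystream: identical plaintext encrypts differently
    # at different memory locations.
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(
            KEY + addr.to_bytes(8, "big") + counter.to_bytes(4, "big")
        ).digest()
        counter += 1
    return out[:length]

def write_block(addr, plaintext):
    ct = bytes(p ^ k for p, k in zip(plaintext, _keystream(addr, len(plaintext))))
    tag = hmac.new(KEY, addr.to_bytes(8, "big") + ct, hashlib.sha256).digest()
    return ct, tag  # what actually lands in external memory

def read_block(addr, ct, tag):
    expect = hmac.new(KEY, addr.to_bytes(8, "big") + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expect):
        raise ValueError("integrity check failed: block was modified")
    return bytes(c ^ k for c, k in zip(ct, _keystream(addr, len(ct))))

ct, tag = write_block(0x1000, b"model weights...")
assert read_block(0x1000, ct, tag) == b"model weights..."

# An attacker flipping one bit in DRAM is caught by the integrity check.
tampered = bytes([ct[0] ^ 1]) + ct[1:]
try:
    read_block(0x1000, tampered, tag)
except ValueError as e:
    print(e)
```

The tag is what "integrity protection" buys over encryption alone, and computing and checking it on every block transfer is exactly where the cost Borza mentions comes from.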

Less obvious is the impact of aging on security, and with chips designed specifically for AI, this potentially is a much bigger problem than in the past because the chips are so focused on performance. In data centers, this requires what is being called strategic processor lifecycle management to safeguard data.

“Aging processors frequently struggle to handle the latest security protocols and may suffer from delays in receiving crucial updates, heightening vulnerability to threats, opening the door to potential cyberattacks, and possibly leading to compliance discrepancies with current hardware regulations,” said Phil Steffora, chief security officer for Arkose Labs. “It’s therefore imperative that CSPs, data center managers, and chip manufacturers work closely together to proactively pinpoint and address potential vulnerabilities, thereby maintaining operational excellence and security.”

AI security is required at the edge, as well, particularly at the sensor level, which is the source of much of the data that needs to be processed.

“We’ve created six essentials for AI edge security, where we go into the general hygiene of the device from a security perspective, the model confidentiality, the model integrity, where you are now able to change parameters and change the inputs of the data,” said Erik Wood, director of the secure MCU product line at Infineon. “We have the extra confidentiality of the models themselves — and keys, more importantly — and then the sensor integrity is something that we’re now talking about, as well. We can go in and directly change parameters to fake out the model, or we can go to the extended piece where the data is being fed from and attack that.”

What happens at the edge is critical to securing AI models from those seeking to steal them for their own use. Rambus’ Best recalled a speaker at a recent hardware conference who cited “some insane numbers” on the number of GPU hours it can take to train a model.

“Turns out it’s really similar to FPGA bit files, which I know it sounds like a record scratch when I said it, but a lot of companies out there generate a bit file that has years and years of engineering in it, and it implements some functionality that is programmed into the programmable user fabric of an FPGA,” he said. “That’s like standard, off-the-shelf Xilinx, and now that standard, off-the-shelf Xilinx becomes whatever this function was going to be. Your adversary, your opponent in this space, your competitor, would very much like to reverse engineer the bit file and see exactly what you’re doing. FPGA companies have done Herculean work to protect that bit file both from a data privacy point of view and a data authenticity point of view, and they use very clever cryptography to ensure both data privacy and data authenticity of that bit file.”


The AI sector is growing at incredible speed, with billions of dollars of venture capital being invested in an ever-increasing number of companies looking to either develop new models or incorporate existing ones in novel ways. That has created huge demand for compute, but also a danger that not just personal or corporate data will be stolen, but the very algorithms on which this industry relies.

While the hardware used to secure the high-performance chips required to supply this compute doesn’t necessarily differ much from that used in other parts of the semiconductor industry, those tasked with securing all this data have begun deploying it in novel ways, even as they develop entirely new components and software designed especially for AI.

Nevertheless, the future may hold some new developments. Because AI chips demand such high performance, new security strategies are being examined that will keep all that information safe. Best said Rambus is examining a method of including “some randomizations into the actual AI model execution itself, so that it was actually performing the correct mathematical model, but at the same time also performing a lot of incorrect models. This creates both noise and counterweight signals so that an adversary who was just trying to reverse engineer from listening to the power supply would have a much more difficult time.”

Yet even these new techniques will come with tradeoffs, a challenge that will remain even as compute continues to increase. “It’s not inexpensive,” Best noted. “It’s random masking. It’s shuffling the actual execution orders and shuffling and masking. It takes time, it takes power, it reduces performance.”
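The masking-and-shuffling idea Best describes can be sketched in miniature. The code below is an illustrative toy in integer arithmetic, not Rambus's implementation: a dot product of weights and inputs is computed in a random order with fresh additive masks at each step, so the intermediate values and their sequence (what a power-analysis adversary observes) change every run, while the final result is always the same.

```python
# Sketch of random masking plus shuffled execution order as a side-channel
# countermeasure (illustrative only). Intermediate accumulator values vary
# from run to run; the unmasked final result does not.

import random

def masked_shuffled_dot(weights, xs, rng=random):
    order = list(range(len(weights)))
    rng.shuffle(order)                 # randomize execution order
    mask_total = 0
    acc = 0
    for i in order:
        m = rng.randrange(1 << 16)     # fresh random additive mask per step
        acc += weights[i] * xs[i] + m  # masked partial product
        mask_total += m
    return acc - mask_total            # strip the accumulated masks

w = [3, -1, 4, 1, -5]
x = [2, 7, 1, 8, 2]
reference = sum(wi * xi for wi, xi in zip(w, x))

# Many randomized runs, one deterministic answer.
assert all(masked_shuffled_dot(w, x) == reference for _ in range(100))
print(reference)  # 1
```

The cost Best notes shows up even here: every step pays for mask generation and the extra bookkeeping, which is pure overhead relative to the plain dot product.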

