LLMs’ Data-Control Path Insecurity – Schneier on Security

Any LLM application that interacts with untrusted users—think of a chatbot embedded in a website—will be vulnerable to attack. It’s hard to think of an LLM application that isn’t vulnerable in some way.

Individual attacks are easy to prevent once discovered and publicized, but there are an infinite number of them and no way to block them as a class. The real problem here is the same one that plagued the pre-SS7 phone network: the commingling of data and commands. As long as the data—whether it be training data, text prompts, or other input into the LLM—is mixed up with the commands that tell the LLM what to do, the system will be vulnerable.

But unlike the phone system, we can’t separate an LLM’s data from its commands. One of the enormously powerful features of an LLM is that the data affects the code. We want the system to modify its operation when it gets new training data. We want it to change the way it works based on the commands we give it. The fact that LLMs self-modify based on their input data is a feature, not a bug. And it’s the very thing that enables prompt injection.