[German]Large Language Models (LLMs) can be attacked via prompts in order to elicit unauthorized data from the models. Man-in-the-prompt browser attacks could also be used to manipulate AI requests from users and use them for criminal activities.
The advent of AI systems in companies is also opening up new methods of attack, some of which are already known from other areas. Man-in-the-middle attacks to read or manipulate data have long been known. Now there is a new attack method, known as 'Man in the Prompt', to attack LLMs via the user's browser. Nic Adams, co-founder and CEO of 0rcus (security provider in the AI sector) sent me some information on this topic.
Man-in-the-prompt attack in the browser
The 'man-in-the-prompt' attack is a novel prompt injection vector as it operates at the Document Object Model (DOM) level. The attackers use a compromised browser extension to inject malicious instructions directly into the input field of an LLM.
This method bypasses traditional application-level security because the attack payload is executed on the client side via a trusted extension instead of relying on a directly user-generated prompt.
Internal corporate hosted LLMs are particularly vulnerable as they provide a lucrative and exposed attack surface due to their reduced security measures and trusted environment where they are often trained with sensitive proprietary data.
DOM-level injection allows an attacker to exfiltrate highly sensitive corporate information, from financial projections to intellectual property. The attack uses an organization's internal LLM as a tool for data exfiltration. This is a method that cannot be easily blocked by standard network security measures.
This attack is possible without the user's knowledge, as a malicious actor can acquire a legitimate, popular browser extension and inject malicious code into it, which is then transmitted to the user's browser as a silent update. The security implications are severe, as the user continues to trust the extension while it secretly leaks data in the background by interacting with LLMs – a process that is invisible to both the user and traditional security tools.
The most likely initial attack vector for this attack is a combination of social engineering and supply chain compromise, where users are tricked into installing malicious extensions, or where a trusted extension is sold and then used as a weapon. I anticipate that future exploitation is highly likely as this attack is both scalable and difficult to detect with traditional security controls, making it a lucrative low-effort opportunity for attackers.
The defense requires a layered approach, starting with strict, granular permission models for all browser extensions and implementing webhooks to monitor DOM interactions in real-time. For internal LLMs, I would recommend isolating the LLM environment, sandboxing its processes from the main DOM, and implementing behavioral analytics to detect anomalous LLM queries and exfiltration patterns.