Blob di C.I.R.C.E.

Indirect Prompt Injection Attacks Against LLM Assistants

Really good research on practical attacks against LLM agents. > “Invitation Is All You Need! Promptware Attacks Against LLM-Powered Assistants > in Production Are Practical and Dangerous” > > Abstract: The growing integration of LLMs into applications has introduced new > security risks, notably known as Promptware—maliciously engineered prompts > designed to manipulate LLMs to compromise the CIA triad of these applications. > While prior research warned about a potential shift in the threat landscape > for LLM-powered applications, the risk posed by Promptware is frequently > perceived as low. In this paper, we investigate the risk Promptware poses to > users of Gemini-powered assistants (web application, mobile application, and > Google Assistant). We propose a novel Threat Analysis and Risk Assessment > (TARA) framework to assess Promptware risks for end users. Our analysis > focuses on a new variant of Promptware called Targeted Promptware Attacks, > which leverage indirect prompt injection via common user interactions such as > emails, calendar invitations, and shared documents. We demonstrate 14 attack > scenarios applied against Gemini-powered assistants across five identified > threat classes: Short-term Context Poisoning, Permanent Memory Poisoning, Tool > Misuse, Automatic Agent Invocation, and Automatic App Invocation. These > attacks highlight both digital and physical consequences, including spamming, > phishing, disinformation campaigns, data exfiltration, unapproved user video > streaming, and control of home automation devices. We reveal Promptware’s > potential for on-device lateral movement, escaping the boundaries of the > LLM-powered application, to trigger malicious actions using a device’s > applications. Our TARA reveals that 73% of the analyzed threats pose > High-Critical risk to end users. We discuss mitigations and reassess the risk > (in response to deployed mitigations) and show that the risk could be reduced > significantly to Very Low-Medium. We disclosed our findings to Google, which > deployed dedicated mitigations...

September 3, 2025 / Schneier on Security

Uncategorized

academic papers

LLM

cyberattack

Regulating AI Behavior with a Hypervisor

Interesting research: “Guillotine: Hypervisors for Isolating Malicious AIs.” > Abstract:As AI models become more embedded in critical sectors like finance, > healthcare, and the military, their inscrutable behavior poses ever-greater > risks to society. To mitigate this risk, we propose Guillotine, a hypervisor > architecture for sandboxing powerful AI models—models that, by accident or > malice, can generate existential threats to humanity. Although Guillotine > borrows some well-known virtualization techniques, Guillotine must also > introduce fundamentally new isolation mechanisms to handle the unique threat > model posed by existential-risk AIs. For example, a rogue AI may try to > introspect upon hypervisor software or the underlying hardware substrate to > enable later subversion of that control plane; thus, a Guillotine hypervisor > requires careful co-design of the hypervisor software and the CPUs, RAM, NIC, > and storage devices that support the hypervisor software, to thwart side > channel leakage and more generally eliminate mechanisms for AI to exploit > reflection-based vulnerabilities. Beyond such isolation at the software, > network, and microarchitectural layers, a Guillotine hypervisor must also > provide physical fail-safes more commonly associated with nuclear power > plants, avionic platforms, and other types of mission critical systems. > Physical fail-safes, e.g., involving electromechanical disconnection of > network cables, or the flooding of a datacenter which holds a rogue AI, > provide defense in depth if software, network, and microarchitectural > isolation is compromised and a rogue AI must be temporarily shut down or > permanently destroyed. ...

April 23, 2025 / Schneier on Security

Tag - threat models