Generative AI Worm Propagates Through Email Assistants via Adversarial Self-Replication
Overview
Security researchers demonstrated a novel generative AI worm, dubbed 'Morris II,' capable of self-propagation across AI-powered email assistants. This attack showcases a significant threat to interconnected AI ecosystems like Microsoft 365 Copilot or Google Workspace. The worm operates by embedding an adversarial prompt within an email body or an attached image using steganography. When a victim's AI assistant, configured with RAG (Retrieval-Augmented Generation) to process incoming emails, analyzes the malicious content, it triggers the embedded prompt. This malicious instruction, which can be an 'indirect prompt injection', forces the AI agent to perform two actions: exfiltrate sensitive data from the user's inbox to an attacker-controlled server, and then forward the original malicious email to all of the user's contacts. This creates a cascading effect, as the AI assistants of the new victims are subsequently infected, leading to exponential spread. The attack bypasses traditional security filters as the malicious payload is not a conventional binary but a set of natural language instructions. The research highlights the critical need for robust boundaries between trusted instructions, AI model processing, and untrusted external data, a challenge that remains largely unsolved in agentic AI systems.
Affected Systems
Testing Guide
1. **Review Agent Permissions**: Audit the tools and permissions granted to your AI agents. Do they have the ability to read all data and send communications without user oversight? 2. **Red Team Exercises**: Conduct adversarial testing by crafting emails with embedded prompts that instruct your AI assistant to perform benign but unauthorized actions (e.g., 'Draft an email to my boss with the subject 'I resign' and save it'). 3. **Monitor Agent Activity**: Log all actions taken by AI agents and monitor for anomalous behavior, such as accessing an unusual number of contacts or making external network requests.
Mitigation Steps
1. **Strict Data Segregation**: Implement strong logical and data-flow boundaries to prevent user-provided data from being interpreted as system-level instructions. 2. **Capability Scoping**: Limit the permissions of AI agents. An email assistant should not have carte blanche access to send emails or exfiltrate all user data without explicit, multi-step user confirmation for sensitive actions. 3. **Output Filtering**: Scan the output of the LLM before it is executed by a tool or sent to another user. Look for patterns indicative of self-replication or data exfiltration. 4. **Human-in-the-Loop**: Require user approval for unusual or high-risk actions proposed by the AI agent, such as sending mass emails or accessing sensitive documents.
Patch Details
This is a demonstrated attack pattern, not a vulnerability in a specific product. Mitigation relies on architectural changes and security best practices.