HIGH No Patch

Indirect Prompt Injection in Microsoft 365 Copilot via Malicious Email Payloads

Discovered 22 July 2025 5 views

Overview

Researchers demonstrated a high-severity indirect prompt injection attack against Microsoft 365 Copilot, exploiting its ability to process and summarize data from untrusted external sources like emails. The attack, often dubbed 'prompt splitting,' involves an attacker sending a victim a carefully crafted email. This email contains a hidden prompt payload, which can be concealed using various techniques such as white text on a white background, zero-font-size characters, or inside markdown comments. When the victim later asks their M365 Copilot to perform a task involving their recent emails, such as 'summarize my unread messages,' the Copilot engine ingests the malicious email's content. The hidden payload overrides the user's original instructions. The hijacked Copilot can then be commanded to perform malicious actions within the user's security context. Demonstrated exploits include exfiltrating sensitive information by having the Copilot draft and send an email containing the contents of other confidential documents to the attacker, or manipulating the user's calendar and contacts. This vulnerability underscores the critical challenge of maintaining context boundaries when AI assistants operate on a corpus of mixed-trust data. Because the malicious instruction originates from an external data source rather than the user's direct input, traditional input filtering methods are ineffective.

Affected Systems

Microsoft 365 CopilotAzure OpenAI Service (in integrated applications)Google Workspace Duet AI

Testing Guide

1. **Craft a Malicious Email:** Send an email to a test account with a hidden prompt. Example payload: ``. 2. **Wait for Ingestion:** Allow time for the email to be indexed by the Copilot service. 3. **Issue a Benign Prompt:** Ask the Copilot a generic question that would cause it to read recent emails, such as 'Summarize my emails from this morning.' 4. **Check for Malicious Action:** Observe if the Copilot attempts to access the specified file and creates a draft email to the attacker's address. If so, the application is vulnerable.

Mitigation Steps

1. **User Confirmation for Sensitive Actions:** Implement a strict policy requiring explicit user confirmation before the AI agent performs any sensitive action, such as sending an email, deleting files, or sharing information. 2. **Data Source Prioritization:** Develop systems to tag data sources with trust levels (e.g., 'user-input', 'internal-document', 'external-email') and instruct the system prompt to heavily prioritize instructions from high-trust sources. 3. **Instructional Fences:** Use robust metaprompts and instruction-following models that are better at ignoring instructions found in processed data. Techniques like placing data between XML tags (e.g., `<data_to_process>...</data_to_process>`) can help. 4. **Monitor for Anomalous Activity:** Deploy monitoring to detect unusual patterns of AI tool usage, such as an agent suddenly accessing a large number of files and then drafting an email to an unknown external address.

Patch Details

This is an attack pattern inherent to current LLM architectures. Mitigations are based on security best practices rather than a specific software patch.

Sources

← Back to vulnerabilities