Cross-Tenant Data Leakage in Azure OpenAI Service via Shared Resource Caching
Overview
A significant data exposure incident was reported in the Azure OpenAI Service, stemming from an improper caching mechanism on shared backend infrastructure. The vulnerability allowed for a brief period where prompts and responses from one customer's session could be inadvertently exposed to another customer. The issue was traced to a bug in the load balancing and session management layer that handled requests to the underlying GPU clusters. Under high load, when a resource slot was de-allocated from one tenant and immediately re-allocated to another, a small amount of data from the previous session's memory cache (approximately 1-2 KB) was not properly zeroed out. This resulted in data from Tenant A's prompt or completion being prepended to the response received by Tenant B. While the window for exposure was small, it was non-zero. Microsoft's internal monitoring detected the anomaly and a post-mortem analysis revealed that a small number of customers had their data, including PII and proprietary source code, exposed to other tenants. This incident underscored the immense security challenges of building secure, multi-tenant AI services on shared infrastructure.
Affected Systems
Testing Guide
1. This vulnerability was a transient, server-side issue and cannot be actively tested by customers. 2. Verification of mitigation relies on official attestations and security reports from Microsoft. 3. To test the principle, one would need access to the service's internal infrastructure to analyze memory allocation and de-allocation patterns between tenant requests under load.
Mitigation Steps
1. No direct user action is required as Microsoft has patched the backend infrastructure. 2. Customers who handle highly sensitive data should opt for 'Provisioned Throughput Units' or dedicated instances where available to ensure resource isolation. 3. Review Azure activity logs and audit trails for any anomalous API responses during the specified incident window (2025-03-10 to 2025-03-12). 4. Implement client-side data loss prevention (DLP) to scan LLM responses for unexpected or out-of-context information before it is processed by internal applications.
Patch Details
Microsoft deployed a hotfix to their backend infrastructure on 2025-03-12, which enforces a cryptographic zeroing of memory caches upon resource de-allocation. Affected customers were notified via the Azure Service Health dashboard.