Server-Side Request Forgery (SSRF) in Azure OpenAI 'On Your Data' Feature
Overview
A high-severity Server-Side Request Forgery (SSRF) vulnerability was discovered in the 'On Your Data' feature of the Azure OpenAI Service. This feature allows users to connect their language models to various data sources, including Azure Blob Storage, to enable Retrieval-Augmented Generation (RAG) over private data. The vulnerability arose from insufficient validation of user-provided URLs during the data source configuration process. An attacker with permissions to configure a data source could provide a specially crafted URL pointing to internal, non-public IP addresses within the Azure fabric. When the service's backend data ingestion component attempted to connect to this URL to fetch data, it would instead send a request to an internal endpoint. This allowed attackers to perform network reconnaissance on Azure's internal infrastructure, probe for open ports on other services, and, most critically, access the Azure Instance Metadata Service (IMDS). By crafting a request to the IMDS endpoint (`169.254.169.254`), an attacker could potentially retrieve temporary service credentials and other sensitive metadata associated with the underlying virtual machine hosting the OpenAI service. This could lead to a cross-tenant data breach or further pivot within the Azure environment. The issue was not in the LLM itself but in the surrounding cloud service architecture that processed user-controlled data URIs. The incident serves as a critical reminder that integrating LLMs with external data sources introduces classic web application vulnerabilities like SSRF into the AI service plane.
Affected Systems
Testing Guide
1. Within the Azure OpenAI Studio, navigate to the 'Add your data' feature. 2. When configuring a data source (e.g., 'From URL'), attempt to provide a URL that points to an internal-only IP address or a server you control (e.g., using a service like Burp Collaborator or Interactsh). 3. Provide a known internal endpoint like `http://169.254.169.254/metadata/instance?api-version=2021-02-01`. 4. Finalize the data source setup and observe if the service attempts to connect. If your collaborator server receives a request, or if the service successfully parses metadata (which would likely result in an error, but a specific one), the endpoint is vulnerable to SSRF.
Mitigation Steps
1. **Use Private Endpoints**: When connecting Azure AI services to data sources, always use Azure Private Endpoints to ensure traffic does not traverse the public internet and is restricted to your virtual network. 2. **Apply Network Policies**: Implement strict egress filtering (Network Security Groups or Azure Firewall) on the subnets hosting AI services to deny all outbound traffic by default and only allow connections to known, trusted endpoints. 3. **Use Managed Identities**: Prefer identity-based authentication (Managed Identities) for accessing data sources over credential or URL-based methods, as this removes the need for the service to handle potentially malicious URLs. 4. **Audit Configurations**: Regularly audit data source configurations in your AI services to ensure they point to legitimate, expected locations and do not contain suspicious-looking URLs or IP addresses.
Patch Details
Microsoft patched the service's backend to implement stricter URL validation, an allowlist for resolvable domains, and enhanced network isolation that prevents the data ingestion service from reaching internal metadata endpoints.