Cross-Tenant Credential Theft in Azure AI Machine Learning via SSRF
Overview
A critical vulnerability was discovered within the Azure AI Machine Learning service that allowed an attacker to achieve cross-tenant privilege escalation. The vulnerability stemmed from a Server-Side Request Forgery (SSRF) flaw combined with inadequate network isolation in the backend infrastructure powering the managed Jupyter compute instances.

An attacker, after gaining low-level access to their own compute instance, could exploit the SSRF to send crafted network requests to an internal Azure service endpoint. Due to weak network policies, this internal service failed to properly validate the source of the requests, allowing the attacker to query the metadata service on behalf of other tenants' compute instances running on the same underlying infrastructure. By iterating through internal IP addresses, the attacker could locate and access the Azure Instance Metadata Service (IMDS) endpoint for a victim's instance. From there, they could request and steal the victim's managed identity token.

This token, acting as a powerful credential, granted the attacker the same permissions as the victim's compute instance, often including access to sensitive data in Azure Blob Storage, Azure Key Vault secrets, and the ability to manipulate ML models and training jobs, leading to a full compromise of the victim's AI assets within that Azure subscription.
Affected Systems
Testing Guide
1. **Check for Security Advisories**: Review the Microsoft Security Response Center (MSRC) for any advisories related to your Azure AI/ML services and ensure all recommended actions have been taken.
2. **Audit Network Configuration**: In the Azure Portal, navigate to your Machine Learning workspace's networking settings. Verify that it is configured to use a private endpoint and that the associated VNet has restrictive NSG rules applied.
3. **Run Penetration Tests**: For high-security environments, conduct authorized penetration tests against your cloud AI infrastructure to proactively identify misconfigurations or vulnerabilities like SSRF.
Mitigation Steps
1. **Apply Microsoft Security Updates**: Ensure all Azure services are kept up-to-date. Microsoft has patched the backend infrastructure to fix the underlying SSRF and improve network isolation.
2. **Use Azure Private Link**: Configure Azure Machine Learning workspaces with a private endpoint to ensure that communication with the workspace and compute resources occurs over a private network, not the public internet.
3. **Enforce Stricter Network Security Groups (NSGs)**: Apply NSGs to the virtual network used by compute instances to restrict all unnecessary outbound traffic, limiting the potential for SSRF attacks.
4. **Regularly Rotate Credentials**: Although the managed identity token was the primary target, regularly rotating all other credentials (such as storage account keys) can limit the impact of a breach.
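One way to check the NSG points in the Testing Guide (step 2) and Mitigation Steps (step 3) is to evaluate an NSG's outbound rules in priority order and list any Allow rule to a broad destination that is still effective. This is an added example, not code from Microsoft or from this advisory; it assumes the flattened rule field names of the NSG JSON as printed by the Azure CLI, and the sample NSGs, rule names and the `_rule` helper are constructed for illustration.

```python
"""Flag effective outbound Allow rules in an NSG that reach broad destinations."""

BROAD_DESTINATIONS = {"*", "Internet", "0.0.0.0/0"}

def _values(rule, singular, plural):
    # Azure stores each address/port field as one value or as a list.
    vals = list(rule.get(plural) or [])
    if rule.get(singular):
        vals.append(rule[singular])
    return vals

def _is_deny_all(rule):
    return (rule["access"] == "Deny" and rule["protocol"] == "*"
            and "*" in _values(rule, "sourceAddressPrefix", "sourceAddressPrefixes")
            and "*" in _values(rule, "destinationAddressPrefix", "destinationAddressPrefixes")
            and "*" in _values(rule, "destinationPortRange", "destinationPortRanges"))

def broad_outbound_allows(nsg):
    """Return names of outbound Allow rules with a broad destination that can still match."""
    rules = nsg.get("securityRules", []) + nsg.get("defaultSecurityRules", [])
    outbound = sorted((r for r in rules if r["direction"] == "Outbound"),
                      key=lambda r: r["priority"])
    findings = []
    for rule in outbound:
        if _is_deny_all(rule):
            break  # rules with a higher priority number can never match past a deny-all
        dests = _values(rule, "destinationAddressPrefix", "destinationAddressPrefixes")
        if rule["access"] == "Allow" and BROAD_DESTINATIONS & set(dests):
            findings.append(rule["name"])
    return findings

def _rule(name, priority, access, dest, port="*", proto="*"):
    # Constructed sample rule using the flattened field names of the CLI's JSON.
    return {"name": name, "priority": priority, "access": access,
            "direction": "Outbound", "protocol": proto, "sourceAddressPrefix": "*",
            "destinationAddressPrefix": dest, "destinationPortRange": port}

# Modelled on the default outbound rules of an NSG (source prefixes simplified to "*").
DEFAULTS = [_rule("AllowVnetOutBound", 65000, "Allow", "VirtualNetwork"),
            _rule("AllowInternetOutBound", 65001, "Allow", "Internet"),
            _rule("DenyAllOutBound", 65500, "Deny", "*")]

# Constructed samples: an NSG with no custom rules, and a restricted one.
DEFAULT_ONLY = {"securityRules": [], "defaultSecurityRules": DEFAULTS}
RESTRICTED = {"defaultSecurityRules": DEFAULTS, "securityRules": [
    _rule("AllowStorageHttps", 100, "Allow", "Storage", "443", "Tcp"),
    _rule("DenyAllOutboundCustom", 4096, "Deny", "*")]}

if __name__ == "__main__":
    print("default-only:", broad_outbound_allows(DEFAULT_ONLY))
    print("restricted:  ", broad_outbound_allows(RESTRICTED))
```

An empty result means no broad outbound Allow rule is reachable before a deny-all; it does not confirm that the remaining Allow rules are the minimum the workspace needs.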
Patch Details
Microsoft patched the vulnerability on their backend infrastructure. No direct user action is required for the fix itself, but configuring private endpoints is strongly recommended.