Path Traversal in NVIDIA Triton Inference Server Allows Unauthorized Model Overwrite
Overview
A high-severity path traversal vulnerability was discovered in NVIDIA's Triton Inference Server, a critical infrastructure component for deploying and serving AI models at scale. The flaw was in the server's API endpoint for managing model repositories. An unauthenticated attacker with network access to the Triton management port could send a specially crafted request to load or unload a model. By embedding `..%2F` (URL-encoded dot-dot-slash) sequences in the model name parameter, the attacker could break out of the intended model repository directory and reach arbitrary locations on the server's filesystem.

The impact was severe. An attacker could read sensitive files such as server configuration files, SSH keys, or, in a shared environment, other tenants' model data. More dangerously, the same flaw could be used to write or overwrite files, allowing an attacker to deface web content served by a co-located server, delete critical system files to cause a denial of service, or replace a legitimate model's weights with a malicious, backdoored version.

The vulnerability affects a core piece of the MLOps pipeline and shows that the infrastructure supporting AI models is as much a security concern as the models themselves.
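The traversal described above can be illustrated with a minimal Python sketch. This is not Triton's code (the server is not written in Python); the function names `naive_model_path` and `resolve_model_path` and the `/models` repository root are hypothetical, chosen only to show how a decoded model name escapes a base directory and how a containment check would reject it.

```python
import os
from urllib.parse import unquote


def naive_model_path(repo_root: str, model_name: str) -> str:
    """Hypothetical vulnerable pattern: decode the name and join it blindly."""
    return os.path.normpath(os.path.join(repo_root, unquote(model_name)))


def resolve_model_path(repo_root: str, model_name: str) -> str:
    """Hypothetical safe pattern: resolve the path, then require it to stay
    strictly inside repo_root. Raises ValueError otherwise."""
    decoded = unquote(model_name)
    root = os.path.realpath(repo_root)
    # realpath collapses ".." segments and follows symlinks before the check.
    candidate = os.path.realpath(os.path.join(root, decoded))
    if candidate == root or os.path.commonpath([root, candidate]) != root:
        raise ValueError(f"model name escapes repository: {model_name!r}")
    return candidate


if __name__ == "__main__":
    payload = "..%2F..%2Fetc%2Fpasswd"
    # On a POSIX system the naive join lands outside the repository.
    print(naive_model_path("/models", payload))
    try:
        resolve_model_path("/models", payload)
    except ValueError as exc:
        print("rejected:", exc)
```

The key design choice in the sketch is to validate the resolved path rather than to search the raw string for `..`, since string filters are easy to bypass with alternative encodings.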
Affected Systems
Testing Guide
1. **Check Version:** Determine the version of the Triton container you are running by using `docker inspect <container_id>` and checking the image tag. If it is older than `nvcr.io/nvidia/tritonserver:24.01-py3`, you are likely vulnerable.
2. **API Test (Use with caution):** From a machine with access to the Triton HTTP API (port 8000 by default; port 8001 is the default gRPC port and will not answer a plain HTTP request), send a request attempting to load a model with a traversal path. For example: `curl -v -X POST localhost:8000/v2/repository/models/..%2F..%2F..%2Fetc%2Fpasswd/load`. A vulnerable server may return an error indicating that it tried and failed to load a file at that path, while a patched server will reject the model name as invalid.
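The version check in step 1 can be automated across many hosts with a small helper. The sketch below is a hedged example: it assumes the image tag follows the `YY.MM` pattern seen in `nvcr.io/nvidia/tritonserver:24.01-py3`, and the names `triton_release` and `is_likely_vulnerable` are hypothetical. Images that were retagged or built locally will not parse and need manual inspection.

```python
import re

# First patched release according to this advisory: 24.01.
PATCHED = (24, 1)


def triton_release(image_ref: str):
    """Extract (year, month) from a tag such as ':24.01-py3'.

    Returns None when the tag does not follow the assumed YY.MM pattern.
    """
    match = re.search(r":(\d{2})\.(\d{2})(?:-|$)", image_ref)
    if not match:
        return None
    return int(match.group(1)), int(match.group(2))


def is_likely_vulnerable(image_ref: str):
    """True if older than the patched release, False if not, None if unknown."""
    release = triton_release(image_ref)
    if release is None:
        return None
    return release < PATCHED


if __name__ == "__main__":
    for ref in (
        "nvcr.io/nvidia/tritonserver:23.12-py3",
        "nvcr.io/nvidia/tritonserver:24.01-py3",
        "tritonserver:latest",
    ):
        print(ref, "->", is_likely_vulnerable(ref))
```

A result of `None` should be treated as "unknown", not as "safe".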
Mitigation Steps
1. **Upgrade Triton:** Immediately upgrade to NVIDIA Triton Inference Server container version `24.01` or newer.
2. **Network Segmentation:** Restrict access to the Triton Inference Server's ports (by default 8000 for HTTP, 8001 for gRPC, and 8002 for metrics). They should be reachable only from trusted internal networks and never exposed to the public internet.
3. **Use Non-Root User:** Run the Triton container as a non-root user to limit the potential impact of a successful file write exploit.
4. **Read-Only Mounts:** If possible, mount the model repository and other critical directories as read-only to prevent unauthorized modification.
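Steps 3 and 4 can be spot-checked from inside the running container. The following is a minimal sketch, not an official NVIDIA tool: the function name `audit_runtime` is hypothetical, and the check only reflects the user and mounts of the process that runs it, so it should be executed as the same user as the server process.

```python
import os


def audit_runtime(model_repo: str) -> list[str]:
    """Return a list of findings about the current process and model_repo."""
    findings = []
    # os.geteuid is unavailable on some platforms (e.g. Windows), so guard it.
    geteuid = getattr(os, "geteuid", None)
    if geteuid is not None and geteuid() == 0:
        findings.append("process is running as root (UID 0)")
    if not os.path.isdir(model_repo):
        findings.append(f"{model_repo} does not exist or is not a directory")
    elif os.access(model_repo, os.W_OK):
        # A read-only mount or restrictive permissions would make this False.
        findings.append(f"{model_repo} is writable by this process")
    return findings


if __name__ == "__main__":
    # "/models" is an assumed mount point; substitute your repository path.
    for finding in audit_runtime("/models"):
        print("WARNING:", finding)
```

An empty result means neither condition was detected for this process; it does not prove that the deployment is otherwise hardened.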
Patch Details
The vulnerability is patched in NVIDIA Triton Inference Server container version 24.01 and all subsequent releases.