NVIDIA Triton Inference Server Path Traversal Enables Remote Code Execution
Overview
A path traversal vulnerability in the NVIDIA Triton Inference Server allows a remote, unauthenticated attacker to achieve arbitrary code execution. The flaw is in the server's model loading mechanism, which fails to sanitize the file path provided in a model repository index. An attacker can craft a model configuration file whose library path contains directory traversal sequences (e.g., `../../..`). When the server is instructed to load this model from an attacker-controlled repository, it loads a shared library from an arbitrary location on the server's filesystem.

By combining this with the ability to upload files (or by leveraging a file that already exists on the host), an attacker can cause the Triton server to load a malicious shared object file (`.so`), whose code then runs with the permissions of the Triton server process. This provides a direct path to RCE on the underlying GPU-enabled infrastructure.

The impact is critical: it allows complete compromise of the machine learning inference server, potential exfiltration of sensitive models and data, and a pivot point for further attacks into the MLOps environment.
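The kind of containment check whose absence makes this class of bug possible can be sketched in a few lines. This is an illustration only, not Triton's actual code (Triton is not written in Python); the function name `is_within_repository` and the `/models` root are hypothetical. The idea is to resolve the requested library path against the repository root and reject anything that ends up outside it.

```python
from pathlib import Path


def is_within_repository(repo_root: str, candidate: str) -> bool:
    """Return True only if `candidate` resolves to a path inside `repo_root`.

    Hypothetical helper for illustration; not part of Triton.
    """
    root = Path(repo_root).resolve()
    # Joining with an absolute candidate discards the root, so absolute
    # paths such as /etc/passwd are rejected by the same check.
    target = (root / candidate).resolve()
    return target == root or root in target.parents


if __name__ == "__main__":
    print(is_within_repository("/models", "resnet/1/libcustom.so"))
    print(is_within_repository("/models", "../../usr/lib/x86_64-linux-gnu/libc.so.6"))
```

Because `resolve()` also follows symbolic links for paths that exist, a check of this shape rejects a symlink inside the repository that points outside it, which a plain string search for `..` would miss.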
Affected Systems
NVIDIA Triton Inference Server versions earlier than 23.05, the release in which the fix was introduced.
Testing Guide
1. Set up a model repository accessible by the Triton server.
2. Create a model configuration file (`config.pbtxt`) containing a path traversal payload in the `platform` or library path field, pointing to a known shared library on the server (e.g., `/usr/lib/x86_64-linux-gnu/libc.so.6`).
3. Send an API request to the server to load the malicious model from the repository.
4. Monitor the server logs for errors or unexpected behavior related to loading a non-model library.

A successful test would involve loading a custom-crafted malicious `.so` file that performs an action like writing a file to `/tmp`.
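The load request in step 3 might look like the following minimal sketch. It assumes the server exposes Triton's HTTP model repository endpoint (`POST /v2/repository/models/<name>/load`) on the default HTTP port 8000 and was started with explicit model control enabled; the base URL and the model name `traversal_test` are placeholders, not values from this advisory. Run it only against systems you are authorized to test.

```python
import urllib.request
from urllib.parse import quote


def build_load_request(base_url: str, model_name: str) -> urllib.request.Request:
    """Build (but do not send) a model load request for the given model name."""
    url = f"{base_url.rstrip('/')}/v2/repository/models/{quote(model_name, safe='')}/load"
    return urllib.request.Request(
        url,
        data=b"{}",  # empty JSON body: no load parameters
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def load_model(base_url: str, model_name: str, timeout: float = 10.0) -> int:
    """Send the load request and return the HTTP status code.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, which is
    the expected outcome on a server that rejects the model.
    """
    with urllib.request.urlopen(build_load_request(base_url, model_name), timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Placeholder target and model name (assumptions for illustration).
    print(load_model("http://localhost:8000", "traversal_test"))
```

A rejected load, together with a log entry about the invalid library path, suggests the server is validating paths; a successful load of a model that references a non-model library warrants further investigation.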
Mitigation Steps
1. **Upgrade Triton Server:** Update to version 23.05 or later, where the path sanitization has been fixed.
2. **Restrict Model Repository Access:** Ensure that only trusted administrators can add or modify models in the repositories that Triton loads from. Do not allow Triton to load models from untrusted or user-writable locations.
3. **Use Minimal Base Images:** Run Triton in a minimal, hardened container image with a reduced set of available libraries to limit the attacker's capabilities post-exploitation.
4. **Implement Filesystem Monitoring:** Use security tools to monitor the Triton server's filesystem for unexpected file writes or the loading of unusual shared libraries.
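As one possible starting point for the library monitoring in step 4, the sketch below parses the text of a Linux `/proc/<pid>/maps` file and reports mapped shared libraries that live outside a set of expected directories. The allowlist in `ALLOWED_PREFIXES` is an assumption and must be adapted to the actual image layout; the function names are hypothetical, and this is not a replacement for a dedicated security tool.

```python
from pathlib import PurePosixPath

# Assumed allowlist of directories that may legitimately hold shared
# libraries; adjust for the real deployment.
ALLOWED_PREFIXES = ("/usr/lib", "/usr/lib64", "/lib", "/lib64", "/usr/local", "/opt/tritonserver")


def mapped_libraries(maps_text: str) -> set:
    """Extract shared-library paths from the contents of /proc/<pid>/maps."""
    libs = set()
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        parts = line.split(None, 5)
        if len(parts) < 6:
            continue  # anonymous mapping, no pathname
        path = parts[5].strip()
        name = PurePosixPath(path).name
        if path.startswith("/") and (name.endswith(".so") or ".so." in name):
            libs.add(path)
    return libs


def unexpected_libraries(maps_text: str, allowed=ALLOWED_PREFIXES) -> list:
    """Return mapped libraries located outside every allowed directory."""
    return sorted(
        lib
        for lib in mapped_libraries(maps_text)
        if not any(lib.startswith(prefix + "/") for prefix in allowed)
    )
```

In practice the input would come from something like `open(f"/proc/{pid}/maps").read()` for the Triton server's process ID, run periodically or from an existing monitoring agent.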
Patch Details
Patched in NVIDIA Triton Inference Server version 23.05 and all subsequent releases.
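A deployment inventory can be checked against the fixed release with a small comparison like the one below. It is a sketch that assumes release tags follow a `YY.MM` pattern, optionally followed by a suffix such as `-py3`, and it takes the fixed version `23.05` from this advisory; the function names are hypothetical.

```python
import re


def parse_release(tag: str) -> tuple:
    """Parse a 'YY.MM' style release tag into a (year, month) tuple."""
    match = re.match(r"\s*(\d+)\.(\d+)", tag)
    if match is None:
        raise ValueError(f"unrecognized release tag: {tag!r}")
    return int(match.group(1)), int(match.group(2))


def is_patched(tag: str, fixed_in: str = "23.05") -> bool:
    """Return True if `tag` is at or after the release containing the fix."""
    return parse_release(tag) >= parse_release(fixed_in)
```

Comparing parsed integer tuples rather than raw strings avoids ordering mistakes if a tag ever omits the leading zero in the month.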