NVIDIA CUDA Driver Use-After-Free Vulnerability Allows Privilege Escalation in Multi-Tenant GPU Clusters
Overview
A use-after-free vulnerability has been identified in the memory management module of the NVIDIA CUDA driver for Linux. The flaw, located in the `nv-ioctl.c` component, is triggered by a specific sequence of memory allocation and deallocation calls from a user-space process. An unprivileged local attacker running an ML workload inside a container can craft a sequence of CUDA API calls that leaves the kernel-mode driver holding a dangling pointer to a freed memory object. By then mapping memory in user space, the attacker can reclaim the physical memory page that backed the freed object and overwrite its contents with malicious shellcode. When the driver later accesses the object through the dangling pointer, it triggers the use-after-free condition and executes the attacker's code with kernel-level privileges (Ring 0).

The attacker can thereby break out of the container and gain full root access to the host machine. The vulnerability is particularly severe in multi-tenant cloud and on-premises Kubernetes clusters where different users share physical GPUs, because one tenant can compromise the entire node and potentially access the data and models of all other tenants.
Affected Systems
Testing Guide
1. **Check Driver Version:** Run `nvidia-smi` on the host system. The driver version appears in the header of the output, in the `Driver Version` field. Compare it against the fixed versions listed in the NVIDIA security bulletin.
2. **Proof-of-Concept (Caution):** Security researchers have released a proof-of-concept tool. Running this tool inside a Docker container with GPU access on a vulnerable host spawns a root shell on the host node. **Only run this in a dedicated, non-production test environment.**
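
The version check in step 1 can be scripted across a fleet. The following is a minimal sketch, not an official tool: it assumes the fixed versions quoted in this advisory (535.161.09 and 550.54.15) apply per driver branch, and the helper names (`parse_version`, `is_patched`, `report`) are hypothetical. Branches not named in this advisory are reported as unknown rather than guessed, so the NVIDIA security bulletin remains the authority for them.

```python
import subprocess

# Assumption: fixed versions quoted in this advisory, keyed by driver branch.
FIXED_VERSIONS = {535: (535, 161, 9), 550: (550, 54, 15)}


def parse_version(text):
    """Turn a driver version string such as '535.161.09' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_patched(version_text):
    """Return True if patched, False if vulnerable, None if the branch is not listed here."""
    version = parse_version(version_text)
    fixed = FIXED_VERSIONS.get(version[0])
    if fixed is None:
        return None  # branch not covered by this advisory; consult the bulletin
    return version >= fixed


def installed_driver_versions():
    """Ask nvidia-smi for the driver version (one output line per GPU, deduplicated)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return sorted({line.strip() for line in result.stdout.splitlines() if line.strip()})


def report():
    """Print a one-line verdict for each installed driver version."""
    labels = {True: "patched", False: "VULNERABLE", None: "unknown branch, check bulletin"}
    for version_text in installed_driver_versions():
        print(f"{version_text}: {labels[is_patched(version_text)]}")
```

Calling `report()` on a GPU host prints the verdict for that host. The comparison functions do not need a GPU, so they can also be run against version strings collected from an inventory system.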
Mitigation Steps
1. **Update NVIDIA Drivers:** Immediately update all affected host systems to a patched driver version (e.g., 535.161.09, 550.54.15, or newer).
2. **Restrict GPU Access:** In multi-tenant environments, use mechanisms such as Kubernetes admission controllers to restrict GPU access to trusted workloads and users.
3. **Use gVisor or Kata Containers:** For untrusted ML workloads, consider sandboxed container runtimes such as gVisor or Kata Containers, which provide an additional layer of kernel isolation and can mitigate the impact of kernel-level exploits.
4. **Monitor for Anomalous Activity:** Deploy a host-based intrusion detection system (HIDS) to monitor for signs of privilege escalation, such as unexpected processes running as root.
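
Steps 2 and 3 can be combined into a single admission policy. The sketch below shows only the decision logic that a validating admission webhook might apply to a Pod object, under stated assumptions: the trusted namespace `ml-trusted` and the RuntimeClass names `gvisor` and `kata` are hypothetical placeholders, and `nvidia.com/gpu` is the extended resource name advertised by the NVIDIA device plugin. It is an illustration of the idea, not a complete webhook.

```python
GPU_RESOURCE = "nvidia.com/gpu"                 # resource name from the NVIDIA device plugin
TRUSTED_NAMESPACES = {"ml-trusted"}             # hypothetical: namespaces allowed direct GPU access
SANDBOXED_RUNTIME_CLASSES = {"gvisor", "kata"}  # hypothetical RuntimeClass names


def requests_gpu(pod):
    """Return True if any container or init container mentions the GPU resource."""
    spec = pod.get("spec") or {}
    containers = (spec.get("containers") or []) + (spec.get("initContainers") or [])
    for container in containers:
        resources = container.get("resources") or {}
        for section in ("limits", "requests"):
            if GPU_RESOURCE in (resources.get(section) or {}):
                return True
    return False


def review_pod(pod):
    """Decide whether a Pod may be admitted. Returns (allowed, message)."""
    if not requests_gpu(pod):
        return True, "pod does not request a GPU"
    namespace = (pod.get("metadata") or {}).get("namespace", "default")
    if namespace in TRUSTED_NAMESPACES:
        return True, f"namespace {namespace!r} is trusted for GPU access"
    runtime_class = (pod.get("spec") or {}).get("runtimeClassName")
    if runtime_class in SANDBOXED_RUNTIME_CLASSES:
        return True, f"GPU pod runs under sandboxed runtime {runtime_class!r}"
    return False, "GPU access requires a trusted namespace or a sandboxed runtimeClassName"
```

In a real deployment the `(allowed, message)` pair would be wrapped in an `AdmissionReview` response by the webhook server. Keeping the policy as a pure function makes it easy to unit-test before it is placed in the admission path.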
Patch Details
Patched in driver versions 535.161.09 and 550.54.15, released in March 2025.