Privilege Escalation via Out-of-Bounds Write in NVIDIA GPU Display Driver for Linux
Overview
Security researchers at Project Zero disclosed a high-severity vulnerability in NVIDIA's GPU Display Driver for Linux. The flaw is an out-of-bounds write in the driver's kernel-mode layer, and a user-mode client can trigger it by sending a specially crafted sequence of commands through the driver's IOCTL interface. It stems from insufficient input validation on parameters controlling memory copy operations within the driver's memory management unit.

An attacker with local, low-privileged access to a system running an affected driver can exploit the flaw to write arbitrary data to kernel memory. This can crash the kernel, causing a denial-of-service (DoS) condition, or, more critically, be leveraged to achieve arbitrary code execution with kernel-level privileges. The risk is greatest in multi-tenant cloud environments where GPU resources are shared: a malicious tenant could escape their container or virtual machine and gain control of the underlying host, compromising all other tenants.

NVIDIA was notified through a coordinated disclosure process and has since released patched driver versions that address the vulnerability.
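The driver's source code is not part of this advisory, so the following is only a conceptual sketch, written in Python, of the kind of bounds validation whose absence leads to an out-of-bounds write. The function `checked_copy` and its parameters are hypothetical and do not correspond to any real driver interface.

```python
def checked_copy(dest: bytearray, offset: int, data: bytes) -> None:
    """Copy `data` into `dest` at `offset`, rejecting out-of-range requests.

    Hypothetical illustration only: a copy routine must confirm that the
    caller-supplied offset and length fit inside the destination buffer
    before writing anything.
    """
    if offset < 0 or offset > len(dest) or len(data) > len(dest) - offset:
        raise ValueError("copy request falls outside the destination buffer")
    # Same-length slice assignment overwrites in place without resizing.
    dest[offset:offset + len(data)] = data
```

In a language without memory safety, skipping the equivalent of the `if` check lets caller-controlled values steer the write past the end of the buffer, which is the class of bug described above.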
Affected Systems
Linux systems running an NVIDIA GPU Display Driver version earlier than 535.129.03.
Testing Guide
1. **Check Driver Version**: Run the `nvidia-smi` command on the Linux host.
2. **Examine Output**: In the header row at the top of the output table, locate the `Driver Version` field.
3. **Compare Versions**: If the reported version is lower than `535.129.03`, treat the system as vulnerable and patch it immediately.
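The manual check above can be automated. The sketch below is one possible approach, not an official tool: it queries `nvidia-smi` for the driver version and compares it numerically against the patched release named in this advisory. The helper names (`parse_version`, `is_vulnerable`, `installed_driver_versions`) are illustrative, and the comparison assumes that the single `535.129.03` threshold applies to every driver branch.

```python
import subprocess
from typing import List, Tuple

# First fixed release named in this advisory.
PATCHED_VERSION = "535.129.03"


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted driver version such as '535.129.03' into (535, 129, 3)."""
    return tuple(int(part) for part in version.strip().split("."))


def is_vulnerable(version: str, patched: str = PATCHED_VERSION) -> bool:
    """Return True when `version` sorts below the patched release."""
    return parse_version(version) < parse_version(patched)


def installed_driver_versions() -> List[str]:
    """Ask nvidia-smi for the driver version reported by each GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]
```

On a host with the driver installed, calling `is_vulnerable(v)` for each `v` returned by `installed_driver_versions()` reproduces step 3. Comparing integer tuples rather than raw strings avoids mistakes such as treating `535.99` as newer than `535.129`.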
Mitigation Steps
1. **Update Drivers**: Immediately update NVIDIA drivers to the latest version recommended by the vendor (e.g., 535.129.03 or newer).
2. **Restrict Driver Access**: In multi-tenant environments, use security mechanisms like seccomp-bpf to restrict the system calls that containers can make to the GPU driver.
3. **Use Secure Runtimes**: Employ container runtimes with stronger isolation, such as gVisor or Kata Containers, to limit the impact of a kernel-level exploit.
4. **Monitor Kernel Logs**: Regularly monitor kernel logs for unexpected errors or crashes related to the NVIDIA driver, which could indicate exploitation attempts.
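For the log-monitoring step, a small filter can surface lines worth a closer look. The sketch below is a starting point under stated assumptions: it treats `NVRM:` and `nvidia` as markers of driver-related messages and `Xid`, `BUG:`, `Oops`, and `general protection fault` as markers of a fault. These marker lists are assumptions to tune for your environment, and a match does not by itself prove exploitation.

```python
from typing import Iterable, List

# Assumed markers; adjust to match the messages seen on your hosts.
DRIVER_MARKERS = ("NVRM:", "nvidia")
FAULT_MARKERS = ("Xid", "BUG:", "Oops", "general protection fault")


def flag_driver_faults(lines: Iterable[str]) -> List[str]:
    """Return log lines that mention the NVIDIA driver and a fault marker."""
    flagged = []
    for line in lines:
        mentions_driver = any(marker in line for marker in DRIVER_MARKERS)
        mentions_fault = any(marker in line for marker in FAULT_MARKERS)
        if mentions_driver and mentions_fault:
            flagged.append(line.rstrip("\n"))
    return flagged
```

The function accepts any iterable of lines, so it can be fed the output of `dmesg` or `journalctl -k` through `sys.stdin`, or an open log file.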
Patch Details
Update to NVIDIA driver version 535.129.03 or later. Patches are available via standard distribution channels and from the NVIDIA website.