HIGH | Patch Available | CVE-2023-25516

Use-After-Free in NVIDIA CUDA Driver Allows Local Privilege Escalation

Discovered 30 November 2025

Overview

A high-severity use-after-free vulnerability has been identified in the NVIDIA Linux kernel driver, affecting systems that rely on GPUs for AI/ML workloads. The flaw lies in the driver's handling of Unified Memory Manager (UMM) memory allocations.

A local attacker with only standard user permissions can run a specially crafted application that issues a series of CUDA API calls. These calls can trigger a race condition in which a kernel-space memory object is freed while a reference to it is still held and remains reachable from the user-space application via the GPU. The attacker can then use this dangling reference to write controlled data into the freed kernel memory. By carefully grooming the kernel heap, the attacker can overwrite critical kernel data structures, such as function pointers or credential structures (`cred`).

Successful exploitation lets the attacker execute arbitrary code in kernel context and escalate from a standard user to root. This is a significant threat to multi-tenant cloud and on-premises AI environments where multiple users share physical GPUs: a single malicious or compromised user account could lead to a full host takeover, compromising the data and workloads of all other tenants.

Affected Systems

- NVIDIA Linux GPU Driver < 535.129.03
- NVIDIA DGX Systems
- Kubernetes clusters with GPU nodes

Testing Guide

1. Identify the currently installed NVIDIA driver version on your Linux system using the `nvidia-smi` command.
2. Compare the installed version against the patched versions listed in the official NVIDIA security bulletin for CVE-2023-25516.
3. If your version is listed as vulnerable, the system is affected.
4. (For security researchers) Obtain a proof-of-concept exploit for the CVE and run it in a controlled, non-production environment to confirm exploitability. The PoC would typically involve a C/C++ program making specific CUDA API calls.
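
Steps 1 to 3 can be automated with a short script. The following is a minimal sketch, not an official NVIDIA tool: it assumes a single patched threshold of 535.129.03 (the version named in this advisory), whereas the official bulletin may list different fixed versions for other driver branches. The helper names (`parse_version`, `is_vulnerable`, `installed_driver_versions`, `main`) are hypothetical and chosen for illustration.

```python
import subprocess

# Threshold taken from this advisory; the official bulletin may list
# different fixed versions for other driver branches (assumption).
PATCHED_VERSION = "535.129.03"


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted driver version such as '535.129.03' into (535, 129, 3)."""
    return tuple(int(part) for part in version.strip().split("."))


def is_vulnerable(installed: str, patched: str = PATCHED_VERSION) -> bool:
    """Return True when the installed version is older than the patched one."""
    return parse_version(installed) < parse_version(patched)


def installed_driver_versions() -> list[str]:
    """Ask nvidia-smi for the driver version reported for each GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def main() -> None:
    """Print one status line per distinct driver version found on the host."""
    for version in sorted(set(installed_driver_versions())):
        status = "VULNERABLE" if is_vulnerable(version) else "patched"
        print(f"{version}: {status}")
```

Calling `main()` on a GPU host prints one line per distinct driver version. The comparison is a plain numeric tuple comparison, so treat a "patched" result as a first check only and confirm it against the bulletin for your driver branch.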

Mitigation Steps

1. **Update NVIDIA Drivers:** Immediately update all affected systems to the patched driver version recommended in the NVIDIA security bulletin (e.g., version 535.129.03 or later).
2. **Restrict GPU Access:** In multi-tenant environments, restrict access to GPU resources to only trusted users and workloads.
3. **Use Kernel-Isolating Runtimes:** For containerized workloads, use runtimes like gVisor or Kata Containers that provide an additional layer of isolation between the container and the host kernel, making kernel exploits more difficult.
4. **Monitor System Logs:** Monitor for anomalous system behavior and kernel panics that could indicate exploitation attempts.
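
The log-monitoring step can be sketched as a simple filter over saved kernel log output. This is an illustrative example only: the pattern list is an assumption (generic Linux kernel fault markers plus the `NVRM: Xid` prefix that the NVIDIA driver uses when reporting GPU errors), not an official set of indicators for this CVE, and the function names are hypothetical.

```python
import re

# Illustrative patterns only (assumption, not an official indicator list):
# generic kernel fault markers plus the "NVRM: Xid" prefix used by the
# NVIDIA driver when it reports GPU errors.
SUSPICIOUS_PATTERNS = [
    re.compile(r"Kernel panic", re.IGNORECASE),
    re.compile(r"general protection fault", re.IGNORECASE),
    re.compile(r"BUG: unable to handle", re.IGNORECASE),
    re.compile(r"KASAN: .*use-after-free", re.IGNORECASE),
    re.compile(r"NVRM: Xid"),
]


def suspicious_lines(lines):
    """Yield (line_number, text) for each log line matching any pattern."""
    for number, line in enumerate(lines, start=1):
        if any(pattern.search(line) for pattern in SUSPICIOUS_PATTERNS):
            yield number, line.rstrip("\n")


def scan_file(path):
    """Scan a saved kernel log, for example the output of `journalctl -k`."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return list(suspicious_lines(handle))
```

A match is only a prompt for investigation. Kernel faults and Xid errors have many benign causes, and a successful exploit may leave no such trace, so this complements patching and does not replace it.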

Patch Details

Patched in NVIDIA driver version 535.129.03 and subsequent releases.
