NVIDIA GPU Driver Kernel Mode Handler Contains Use-After-Free Vulnerability Leading to Privilege Escalation
Overview
NVIDIA has released a security bulletin addressing a high-severity vulnerability in the kernel mode layer of its GPU display driver for Windows and Linux. The vulnerability, tracked as CVE-2024-0073, is a use-after-free condition that can be triggered by a specially crafted shader submitted to the GPU through standard APIs such as CUDA or DirectX. A local, low-privileged user can exploit the flaw to crash the system, causing a denial of service. More critically, sophisticated exploitation could lead to arbitrary code execution in the kernel, allowing full privilege escalation to SYSTEM on Windows or root on Linux.

The vulnerability is particularly dangerous in multi-tenant AI/ML environments, such as shared JupyterHub deployments or Kubernetes clusters with GPU time-slicing, where containers share the host's kernel driver. A malicious user could exploit it to escape their container and gain control of the underlying host node, compromising every other container on the machine and potentially exposing sensitive models, training data, and credentials.

Security researchers discovered the vulnerability by fuzzing the driver's IOCTL handler.
Affected Systems
Testing Guide
1. **Check Driver Version**: On Linux, run `nvidia-smi` to see the installed driver version. On Windows, check the NVIDIA Control Panel under 'System Information'.
2. **Compare with Advisory**: Compare your installed version with the 'Affected Versions' listed in the official NVIDIA security bulletin for CVE-2024-0073.
3. **Confirm Vulnerability**: If your driver version is lower than the 'Updated Version' listed in the bulletin for your respective OS and product type, your system is vulnerable.
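The version check above can be scripted. The following is a minimal sketch, not an official tool: the helper names (`parse_version`, `is_older`, `installed_driver_version`) are hypothetical, and the fixed version is the Linux value quoted in the Mitigation Steps. It assumes a plain numeric comparison is good enough. NVIDIA maintains several driver branches, so the bulletin's per-branch 'Updated Version' table remains the authoritative reference.

```python
import subprocess


def parse_version(text):
    """Turn a dotted version such as '550.40.07' into (550, 40, 7)."""
    return tuple(int(part) for part in text.strip().split("."))


def is_older(installed, fixed):
    """True when the installed version sorts below the fixed version."""
    return parse_version(installed) < parse_version(fixed)


def installed_driver_version():
    """Ask nvidia-smi for the driver version of the first GPU; None if unavailable."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        return None
    lines = out.strip().splitlines()
    return lines[0].strip() if lines else None


if __name__ == "__main__":
    # Assumption: Linux fixed version as quoted in this advisory;
    # confirm against the bulletin for your driver branch.
    FIXED_LINUX = "550.40.07"
    version = installed_driver_version()
    if version is None:
        print("nvidia-smi not available; could not determine driver version")
    elif is_older(version, FIXED_LINUX):
        print(f"{version} is older than {FIXED_LINUX}: treat as potentially affected")
    else:
        print(f"{version} is at or above {FIXED_LINUX}")
```

On a fleet, the same comparison could be run per node and the results collected centrally. The sketch only reports on the local machine.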
Mitigation Steps
1. **Update Drivers**: Immediately update all NVIDIA drivers on affected systems to the versions specified in the security bulletin (e.g., 551.52 for Windows, 550.40.07 for Linux) or newer.
2. **Restrict GPU Access**: In multi-tenant environments, use security mechanisms like SELinux or AppArmor to restrict which users and processes can access GPU device files (`/dev/nvidia*`).
3. **Use Virtualization**: For critical workloads, consider using GPU passthrough to a dedicated virtual machine to isolate the driver from the host kernel and other tenants.
4. **Monitor for Anomalies**: Monitor GPU performance and system logs for unexpected crashes or kernel panics, which could indicate attempts to exploit this vulnerability.
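As a starting point for the access-restriction step, it can help to see how exposed the GPU device files currently are. The sketch below uses hypothetical helper names (`world_accessible`, `audit_gpu_device_files`). It lists the nodes matching `/dev/nvidia*` and flags any that grant read or write access to all users. It inspects only the classic Unix permission bits and says nothing about SELinux or AppArmor policy, which must be reviewed separately.

```python
import glob
import os
import stat


def world_accessible(mode):
    """True if 'other' users have read or write permission in this mode."""
    return bool(mode & (stat.S_IROTH | stat.S_IWOTH))


def audit_gpu_device_files(pattern="/dev/nvidia*"):
    """Return (path, octal permissions, world_accessible) for each matching node."""
    report = []
    for path in sorted(glob.glob(pattern)):
        mode = os.stat(path).st_mode
        report.append((path, oct(stat.S_IMODE(mode)), world_accessible(mode)))
    return report


if __name__ == "__main__":
    for path, perms, exposed in audit_gpu_device_files():
        flag = "world-accessible" if exposed else "restricted"
        print(f"{path}: {perms} ({flag})")
```

Nodes reported as world-accessible are candidates for tighter group ownership or mandatory access control rules. Whether that is practical depends on how workloads on the host are granted GPU access.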
Patch Details
NVIDIA has released updated driver versions (551.52 and later for Windows, 550.40.07 and later for Linux) that address this vulnerability by correcting the memory management within the affected kernel component.