Privilege Escalation in NVIDIA GPU Display Driver Due to Improper Input Validation
Overview
NVIDIA has released a security update addressing a high-severity vulnerability in its GPU Display Driver for both Windows and Linux. The vulnerability, identified as CVE-2024-0072, resides in the kernel mode driver component (`nvlddmkm.sys` on Windows, `nvidia.ko` on Linux), which manages low-level GPU operations. The driver fails to properly validate input passed from a user-mode application. A local attacker with basic user privileges can run a specially crafted application that sends a malicious sequence of API calls and parameters to the driver, triggering an out-of-bounds write in kernel memory.

Successful exploitation lets the attacker corrupt kernel data structures, leading either to a denial of service (system crash) or, more critically, to execution of arbitrary code with kernel-level (SYSTEM/root) privileges.

The risk is most significant in multi-tenant AI/ML environments, such as shared JupyterHub instances or virtualized GPU clusters, where a compromised tenant could escape their container, take full control of the host machine, and gain access to other tenants' data and computational resources. The vulnerability affects a wide range of consumer (GeForce) and enterprise (Tesla, Quadro) GPU drivers.
Affected Systems
Testing Guide
1. **Check Driver Version (Linux)**: Run the command `nvidia-smi` and check the 'Driver Version' in the output.
2. **Check Driver Version (Windows)**: Open the NVIDIA Control Panel, go to 'Help' -> 'System Information', and note the driver version.
3. **Compare with Bulletin**: Compare your installed driver version with the 'Affected Versions' and 'Patched Versions' listed in the official NVIDIA security bulletin for CVE-2024-0072. If your version is lower than the patched version for your product line, you are vulnerable.
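The comparison in step 3 can be scripted. The following is a minimal sketch, not an official NVIDIA tool: it assumes the patched versions are the two stated in the Patch Details section of this advisory, and it compares versions purely numerically. The official bulletin may list different patched versions per driver branch, so treat the result as a first-pass check. The helper names (`parse_version`, `is_patched`, `installed_driver_version`) are hypothetical.

```python
import subprocess
from typing import Tuple

# Assumption: minimum patched versions taken from this advisory's
# "Patch Details" section. Verify against the official bulletin,
# which may list separate patched versions per driver branch.
PATCHED = {"windows": "551.52", "linux": "550.54.14"}


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '550.54.14' into (550, 54, 14)."""
    return tuple(int(part) for part in version.strip().split("."))


def is_patched(installed: str, platform_name: str) -> bool:
    """Return True if `installed` is at or above the patched version."""
    return parse_version(installed) >= parse_version(PATCHED[platform_name])


def installed_driver_version() -> str:
    """Ask nvidia-smi for the driver version (requires the driver to be installed)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    # One line is printed per GPU; all GPUs share the same driver version.
    return result.stdout.splitlines()[0].strip()
```

On a host with the driver installed, `is_patched(installed_driver_version(), "linux")` returns `False` when the driver predates the patched release and should be updated.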
Mitigation Steps
1. **Update Drivers**: Immediately update all NVIDIA drivers to the patched versions listed in the official NVIDIA Security Bulletin.
2. **Restrict User Access**: In shared environments, limit direct access to the host OS and isolate users from one another. Containers share the host kernel and its GPU driver, so they do not by themselves block a kernel-driver exploit; prefer virtual machines where tenants are untrusted.
3. **Use Secure Virtualization**: For multi-tenant GPU workloads, employ virtualization technologies with hardware-enforced isolation (e.g., SR-IOV virtual functions with the IOMMU enabled) to reduce the risk of host compromise from a guest environment.
4. **Monitor System Logs**: Monitor for unexpected system crashes or kernel panics, which could indicate attempted exploitation.
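The log monitoring in step 4 can be partly automated on Linux hosts. The sketch below is illustrative only: it filters kernel log lines for a few generic crash indicators (kernel panics, oopses, and the `NVRM: Xid` error reports that the NVIDIA kernel module writes to the kernel log). These patterns are assumptions chosen for illustration, not known signatures of CVE-2024-0072 exploitation, and a match only means the line deserves a closer look. The function name `find_suspicious_lines` is hypothetical.

```python
import re
from typing import Iterable, List

# Assumption: generic crash indicators, not exploit-specific signatures.
SUSPICIOUS_PATTERNS = [
    re.compile(r"NVRM: Xid"),                    # NVIDIA driver error report
    re.compile(r"Kernel panic", re.IGNORECASE),  # fatal kernel error
    re.compile(r"\bOops\b"),                     # non-fatal kernel fault
    re.compile(r"BUG: unable to handle"),        # bad kernel memory access
]


def find_suspicious_lines(lines: Iterable[str]) -> List[str]:
    """Return the log lines that match any of the crash indicators."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(pattern.search(line) for pattern in SUSPICIOUS_PATTERNS)
    ]
```

One way to use it is to save the kernel log to a file (for example with `journalctl -k > kernel.log` or `dmesg > kernel.log`) and pass the open file object to `find_suspicious_lines`, then review each returned line by hand.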
Patch Details
Patches are available in NVIDIA driver versions 551.52 (Windows) and 550.54.14 (Linux) and later.