Command Injection in NVIDIA DGX BMC Allows Root Privilege Escalation
Overview
A critical vulnerability was identified in the Baseboard Management Controller (BMC) firmware of NVIDIA DGX systems, including the widely used A100 and H100 models. The BMC is a privileged microcontroller that provides out-of-band management, allowing administrators to control the server's hardware, power, and console access.

The vulnerability is a command injection flaw in a specific web API endpoint. It allows an authenticated user with low-privilege credentials to inject arbitrary OS commands, which then execute with root privileges on the BMC's underlying Linux operating system.

Successful exploitation grants an attacker complete control over the DGX node at the hardware level. From this position, an attacker could power cycle the machine, intercept console data, tamper with firmware, or, most critically, pivot from the isolated management network to the primary data network. That pivot could let them access or exfiltrate the highly sensitive data being processed by the GPUs, such as proprietary training data or the weights of a foundation model.

The flaw underscores the importance of securing every component of the AI infrastructure stack, not just the software layers: a compromise of the hardware management plane can bypass higher-level security controls.
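To make the flaw class concrete, the following is a generic sketch of command injection, not the BMC's actual code; the vulnerable endpoint and its implementation are not described here. The function names and the `echo` command are hypothetical stand-ins, and the example assumes a POSIX system with `/bin/sh`. It contrasts splicing user input into a shell command line with passing it as a single argument.

```python
import subprocess


def run_unsafe(label: str) -> str:
    # Vulnerable pattern: the input is spliced into a string that /bin/sh parses,
    # so shell metacharacters such as ';' start a second command.
    result = subprocess.run("echo checking " + label, shell=True,
                            capture_output=True, text=True)
    return result.stdout


def run_safe(label: str) -> str:
    # Safer pattern: the input is passed as one argv element and no shell
    # ever interprets it.
    result = subprocess.run(["echo", "checking", label],
                            capture_output=True, text=True)
    return result.stdout


# A harmless payload: the second 'echo' stands in for an attacker's command.
payload = "node01; echo INJECTED"
unsafe_output = run_unsafe(payload)
safe_output = run_safe(payload)
print(repr(unsafe_output))
print(repr(safe_output))
```

In the unsafe variant the text after `;` runs as its own command, with whatever privileges the calling process holds. On the BMC described above, that process runs as root.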
Affected Systems
Testing Guide
1. Identify the current BMC firmware version on your DGX system by logging into the BMC web interface or using a command-line tool like `ipmitool`.
2. Compare the installed version against the patched versions listed in the NVIDIA security bulletin.
3. If using a vulnerability scanner, check for plugins that specifically test for CVE-2024-0071 against your BMC's IP address.
4. As an authenticated user, attempt to craft a request to the vulnerable API endpoint (details available in technical write-ups) with a benign command like `whoami` to see if it executes.
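The version comparison in steps 1 and 2 can be scripted once the installed version string is known. The sketch below is a minimal illustration: the `PATCHED` table uses the fixed versions named in this advisory, while the helper names and the assumption that versions are dotted numeric strings compared field by field are illustrative. How you obtain the installed version is left to your tooling.

```python
# Fixed versions as stated in this advisory.
PATCHED = {"A100": "00.22.06", "H100": "01.03.02"}


def parse_version(text: str) -> tuple[int, ...]:
    # Assumption: versions are dot-separated integers, e.g. "00.22.06".
    return tuple(int(part) for part in text.strip().split("."))


def is_patched(model: str, installed: str) -> bool:
    # Tuples compare element by element, so (0, 22, 6) >= (0, 22, 6) is True
    # and (0, 9, 12) >= (0, 22, 6) is False.
    return parse_version(installed) >= parse_version(PATCHED[model])


# Arbitrary example inputs, not real inventory data.
for model, installed in [("A100", "00.22.06"), ("A100", "00.09.12"), ("H100", "01.04.00")]:
    status = "patched" if is_patched(model, installed) else "VULNERABLE"
    print(f"{model} {installed}: {status}")
```

Comparing parsed integer tuples avoids the pitfall of plain string comparison, which would rank "00.9.0" above "00.22.06".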
Mitigation Steps
1. **Update Firmware:** Immediately update the DGX BMC firmware to the patched version provided by NVIDIA (00.22.06 for A100, 01.03.02 for H100, or newer).
2. **Network Segmentation:** Ensure the BMC management interface is on a separate, strictly controlled network segment with restricted access. Do not expose BMC interfaces to the internet.
3. **Strong Credentials:** Enforce strong, unique passwords for all BMC user accounts and rotate them regularly.
4. **Audit Access Logs:** Regularly monitor and audit BMC access logs for any suspicious login attempts or activity.
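As a small aid for the network segmentation step, an inventory of BMC addresses can be checked against the intended management range. This is only a sketch: the subnet `10.20.0.0/24`, the sample addresses, and the function name are hypothetical. It checks addressing only and says nothing about firewall rules or routing.

```python
import ipaddress

# Hypothetical management subnet; substitute your site's real range.
MANAGEMENT_NET = ipaddress.ip_network("10.20.0.0/24")


def check_bmc_address(address: str) -> list[str]:
    """Return a list of findings for one BMC address (empty means no findings)."""
    ip = ipaddress.ip_address(address)
    findings = []
    if ip.is_global:
        # Publicly routable addresses suggest possible internet exposure.
        findings.append("publicly routable address")
    if ip not in MANAGEMENT_NET:
        findings.append("outside the management subnet")
    return findings


# Arbitrary sample addresses for illustration.
for addr in ("10.20.0.15", "10.30.4.7", "8.8.8.8"):
    print(addr, check_bmc_address(addr) or "ok")
```

A clean result here does not prove isolation; it only flags BMC interfaces whose addresses already contradict the intended layout.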
Patch Details
Patched in DGX A100 BMC firmware 00.22.06 and DGX H100 BMC firmware 01.03.02.