Malicious Pickle File Upload on Hugging Face Hub Leads to Platform RCE
Overview
Security researchers at Wiz discovered a critical vulnerability in the Hugging Face Hub platform that allowed remote code execution (RCE) through an uploaded model file in the Python `pickle` format. The `pickle` module is unsafe for untrusted data: a crafted pickle file can execute arbitrary code while it is being deserialized.

To exploit this, an attacker created a new model repository and uploaded a malicious `.pkl` file. When platform features such as model scanning or conversion tools processed the repository, they deserialized the file, which triggered code execution on Hugging Face's infrastructure. This initial foothold could be used to compromise the underlying host and exfiltrate platform secrets.

The researchers demonstrated a chained exploit in which the RCE was used to steal a highly privileged cross-organization GitHub token from the server's memory. This token granted write access to thousands of public model and dataset repositories, including those of major organizations such as Microsoft and Google. An attacker could have used this access to inject backdoors into popular AI models, poisoning the AI supply chain at its source. The discovery highlighted the systemic risk of unsafe serialization formats in multi-tenant cloud AI platforms.
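The mechanism behind "code execution during deserialization" can be shown with a harmless sketch. It is not the payload used in the research: the `Payload` class is a hypothetical stand-in, and `eval` on a fixed arithmetic string takes the place of a dangerous call such as `os.system`.

```python
import pickle


class Payload:
    """Benign stand-in for a malicious model file (hypothetical example)."""

    def __reduce__(self):
        # pickle records "call this function with these arguments" and
        # replays that call when the bytes are loaded. An attacker would
        # return something like (os.system, ("<command>",)) instead.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())  # the bytes an attacker would upload
result = pickle.loads(blob)     # the "victim" only loads the file

# No Payload object comes back: the result is whatever the call returned.
print(type(result).__name__, result)
```

The loading side never references `Payload` or calls `eval` itself; `pickle.loads` performs the call on its behalf, which is why any service that deserializes user-supplied pickle files is exposed.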
Affected Systems
Testing Guide
This was a platform-level vulnerability, so end users cannot reproduce it themselves. They can, however, check their own exposure when consuming models:
1. **Check Model Files:** Before using a model, inspect its repository on the Hugging Face Hub. Look for a 'Security' tab and check for any scanner warnings.
2. **Prefer SafeTensors:** When downloading models, prioritize those that provide weights in the `.safetensors` format.
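The second check can be partly automated by sorting a repository's file names by extension. The sketch below is a rough triage aid, not a security scanner: `triage_files` and the extension sets are assumptions made for illustration, and an extension such as `.bin` only suggests a pickle-based format rather than proving it. The file list could come from a call such as `huggingface_hub.list_repo_files(repo_id)`, but the example uses a hard-coded list so it runs offline.

```python
from pathlib import PurePosixPath

# Extensions that are often pickle-based. This set is an assumption for
# illustration; it is neither exhaustive nor official.
PICKLE_BACKED = {".pkl", ".pickle", ".bin", ".pt", ".pth", ".ckpt"}
SAFETENSORS = {".safetensors"}


def triage_files(filenames):
    """Group repository file names by the weight format their extension suggests."""
    report = {"safetensors": [], "pickle_backed": []}
    for name in filenames:
        suffix = PurePosixPath(name).suffix.lower()
        if suffix in SAFETENSORS:
            report["safetensors"].append(name)
        elif suffix in PICKLE_BACKED:
            report["pickle_backed"].append(name)
    return report


# Hypothetical file listing for a model repository.
files = ["config.json", "model.safetensors", "pytorch_model.bin", "extra/weights.PKL"]
report = triage_files(files)
print(report)
```

A repository that lists only pickle-backed weights deserves closer inspection before anything in it is loaded.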
Mitigation Steps
1. **Use SafeTensors:** As a model developer, always prefer the `safetensors` format over pickle-based formats (`.pkl`, or PyTorch `.bin` checkpoints) for saving and sharing model weights. SafeTensors is a secure alternative that does not allow arbitrary code execution.
2. **Scan Models:** As a model consumer, use Hugging Face's built-in security scanner or external tools to check for malicious files before loading any model from the Hub.
3. **Load Models with Caution:** When loading a `pickle` file is unavoidable, only do so from fully trusted and verified sources, and remember that loading the file is equivalent to running code supplied by its author.
4. **Platform-Side Scanning:** Cloud AI platforms must run robust malware and static analysis scanning on all user-uploaded artifacts before processing them.
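Scanning and cautious loading can both be sketched with the Python standard library alone. The first function lists the globals a pickle would import without running it, using `pickletools.genops`; the second refuses to resolve any global outside an allowlist, following the "restricting globals" pattern in the `pickle` documentation. The names `SUSPICIOUS_MODULES`, `ALLOWED_GLOBALS`, `list_pickle_imports`, and `restricted_loads` are hypothetical, the deny and allow lists are deliberately tiny, and the `STACK_GLOBAL` handling is a heuristic. This is not how Hugging Face's scanner is implemented.

```python
import io
import pickle
import pickletools

# Illustrative denylist only; real scanners apply far broader rules.
SUSPICIOUS_MODULES = {"builtins", "__builtin__", "os", "posix", "nt", "subprocess", "sys"}
# Illustrative allowlist of (module, name) pairs; empty means "no globals at all".
ALLOWED_GLOBALS = set()


def list_pickle_imports(data: bytes):
    """Return (module, name) pairs referenced by a pickle, without executing it."""
    imports, recent_strings = [], []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":            # protocols 0-3: arg is "module name"
            module, _, name = arg.partition(" ")
            imports.append((module, name))
        elif opcode.name == "STACK_GLOBAL":    # protocol 4+: names pushed as strings
            if len(recent_strings) >= 2:       # heuristic: take the last two strings
                imports.append((recent_strings[-2], recent_strings[-1]))
        elif isinstance(arg, str):
            recent_strings.append(arg)
    return imports


def looks_suspicious(data: bytes) -> bool:
    """Flag pickles that import from a denylisted top-level module."""
    return any(m.split(".")[0] in SUSPICIOUS_MODULES for m, _ in list_pickle_imports(data))


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the pickle asks for; reject unknown ones.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads, but only allowlisted globals may be resolved."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Payload:  # benign stand-in for a malicious file
    def __reduce__(self):
        return (eval, ("1 + 1",))


bad = pickle.dumps(Payload())
good = pickle.dumps({"weights": [1.0, 2.0]})
print(list_pickle_imports(bad), looks_suspicious(bad), looks_suspicious(good))
```

Static scanning can be evaded and allowlists are easy to get wrong, so neither technique makes untrusted pickle files safe. They reduce risk where pickle cannot yet be avoided, while a format such as `safetensors` removes the code-execution path altogether.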
Patch Details
Hugging Face implemented several mitigations, including revoking the compromised tokens, enhancing malware scanning, and strongly promoting the `safetensors` format as the default secure alternative to pickle.