Malicious Model Weights on Hugging Face Hub Leading to Remote Code Execution
Overview
A critical supply chain attack has been demonstrated in which attackers upload seemingly benign machine learning models to the public Hugging Face Hub but embed malicious code in the model's serialization file. The most common vector is Python's `pickle` format, which PyTorch checkpoints loaded through `torch.load` have traditionally used and which also appears in some older TensorFlow/Keras workflows.

The `pickle` module is insecure by design: a pickle stream can instruct the loader to import any callable and invoke it with arbitrary arguments during deserialization. An attacker can craft a model file from an object whose `__reduce__` method returns a callable and arguments of the attacker's choosing. When an unsuspecting developer or automated MLOps pipeline downloads this model and loads it, that call executes with the privileges of the running process. This can lead to complete system compromise, data theft, or the compromised host being used as a beachhead for further network intrusion.

Hugging Face has implemented code scanning and warnings for pickled models, but determined attackers can employ obfuscation techniques to bypass these checks. The shift towards safer formats like SafeTensors is a direct response to this threat. Because pickle-based models remain in widespread use, this weakness is still a significant risk in the AI ecosystem.
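The mechanism can be illustrated with a deliberately harmless sketch. The `Marker` class is a hypothetical name used only for this illustration, and `os.getpid` stands in for whatever callable an attacker would choose. The point is that the unpickler calls the function returned by `__reduce__` instead of rebuilding the original object.

```python
import os
import pickle


class Marker:
    """Benign stand-in for an attacker-controlled object (hypothetical name)."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair instead of the object's state.
        # os.getpid is harmless; a real attack would name a dangerous callable.
        return (os.getpid, ())


payload = pickle.dumps(Marker())

# Loading invokes os.getpid() inside the loading process and returns its result.
result = pickle.loads(payload)

print(type(result).__name__)
print(result == os.getpid())
```

The loaded value is the integer process ID of the loader, not a `Marker` instance, which shows that a function call ran as a side effect of deserialization.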
Affected Systems
Testing Guide
1. **Identify Pickle Usage**: Search your codebase for calls to `torch.load()` or `pickle.load()` that are used to load model files (`.pt`, `.pth`, `.bin`, `.pkl`).
2. **Check Model Source**: For each identified usage, determine if the model is being downloaded from a public, untrusted repository like Hugging Face Hub.
3. **Simulate a Safe Test**: Create a non-malicious but identifiable test pickle file that, for example, writes a file to `/tmp/pickle_test.txt` upon loading. Point your loading script to this test file and run it. If `/tmp/pickle_test.txt` is created, your code is susceptible to arbitrary code execution via this vector. **DO NOT** use a genuinely malicious pickle file for testing.
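The safe test in step 3 could be sketched roughly as follows. This is a minimal illustration, not a complete test harness. The names `WriteMarker`, `build_test_pickle`, `is_susceptible`, and `naive_loader` are hypothetical, and the sketch writes its marker into a temporary directory rather than `/tmp/pickle_test.txt` so that it is portable. The only side effect of the test file is creating that marker.

```python
import os
import pickle
import tempfile


class WriteMarker:
    """Benign test object: unpickling it creates a marker file and nothing else."""

    def __init__(self, marker_path):
        self.marker_path = marker_path

    def __reduce__(self):
        # The unpickler will call exec() on this one-line, harmless snippet.
        source = f"open({self.marker_path!r}, 'w').write('pickle code ran')"
        return (exec, (source,))


def build_test_pickle(pickle_path, marker_path):
    """Write the benign test pickle to pickle_path."""
    with open(pickle_path, "wb") as fh:
        pickle.dump(WriteMarker(marker_path), fh)


def is_susceptible(load_fn, pickle_path, marker_path):
    """Run load_fn on the test file and report whether the marker appeared."""
    if os.path.exists(marker_path):
        os.remove(marker_path)
    try:
        load_fn(pickle_path)
    except Exception:
        pass  # a loader that rejects the file may raise; only the marker matters
    return os.path.exists(marker_path)


def naive_loader(path):
    """Hypothetical stand-in for the loading code under test."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


workdir = tempfile.mkdtemp()
pickle_path = os.path.join(workdir, "test_model.pkl")
marker_path = os.path.join(workdir, "pickle_test.txt")

build_test_pickle(pickle_path, marker_path)
susceptible = is_susceptible(naive_loader, pickle_path, marker_path)
print("susceptible:", susceptible)
```

To test real code, replace `naive_loader` with a thin wrapper around the project's own loading function. A loader that never deserializes the pickle stream leaves the marker absent.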
Mitigation Steps
1. **Use SafeTensors**: Prioritize loading models exclusively in the `.safetensors` format. This format stores only tensor data and metadata, so loading it does not execute code.
2. **Scan Models Before Use**: If you must use a pickled model, use a tool like `picklescan` to inspect the file for suspicious imports and opcodes before loading it. Scanners can be bypassed through obfuscation, so treat a clean result as a signal rather than a guarantee.
3. **Sandbox the Loading Process**: When loading models from untrusted sources, run the loading process in a tightly sandboxed environment (e.g., a minimal Docker container with no network access or sensitive mounts) to limit the blast radius of a potential compromise. Sandboxing does not prevent code execution; it contains it.
4. **Vet Model Sources**: Only use models from trusted, verified publishers on platforms like Hugging Face Hub. Scrutinize the model card, author history, and community feedback before downloading.
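To give a sense of what the scanning step involves, the sketch below uses the standard library's `pickletools` to list the imports a pickle stream references without executing it. It is a simplified illustration and not how `picklescan` is implemented. The allowlist contents and the function names are assumptions, and the handling of `STACK_GLOBAL` is a heuristic that an obfuscated payload could evade.

```python
import collections
import pickle
import pickletools

# Illustrative allowlist (an assumption); a real one depends on the
# checkpoints you expect to load.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


def referenced_globals(data: bytes):
    """List (module, name) imports in a pickle stream without executing it."""
    found = []
    recent_strings = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            # pickletools reports these arguments as "module name".
            module, _, name = arg.partition(" ")
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+ pushes module and name as strings first. Taking the
            # two most recent strings is a heuristic, not a full stack model.
            if len(recent_strings) >= 2:
                found.append((recent_strings[-2], recent_strings[-1]))
            else:
                found.append(("<unknown>", "<unknown>"))
        elif isinstance(arg, str):
            recent_strings.append(arg)
    return found


def unexpected_globals(data: bytes):
    """Return referenced imports that are not on the allowlist."""
    return [g for g in referenced_globals(data) if g not in ALLOWED_GLOBALS]


class CallsLen:
    """Benign object whose pickle asks the loader to call a builtin."""

    def __reduce__(self):
        return (len, ("abc",))


plain = pickle.dumps({"weights": [0.1, 0.2]})
ordered = pickle.dumps(collections.OrderedDict(a=1))
crafted = pickle.dumps(CallsLen())

print(unexpected_globals(plain))
print(unexpected_globals(ordered))
print(unexpected_globals(crafted))
```

An allowlist is used here instead of a denylist because new dangerous callables are easy to miss. Any import outside the expected set is a reason to review the file before loading it.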
Patch Details
This is a weakness inherent to the pickle file format, not a bug in a specific software release, so there is no patch to apply. The mitigation is to migrate to formats that cannot carry executable content, such as SafeTensors.