Arbitrary Code Execution via Poisoned Model Weights on Hugging Face Hub Using Unsafe `pickle` Deserialization
Overview
Researchers demonstrated a critical supply chain attack vector that targets developers who use models from the Hugging Face Hub. An attacker uploads a seemingly legitimate AI model but embeds a malicious payload in the model's weight files, specifically those saved in the Python `pickle` format (`pytorch_model.bin`). The `pickle` module is unsafe for deserializing untrusted data because a pickled object can instruct the loader to import and call arbitrary Python callables. In this attack, the attacker crafts a pickled object that, upon deserialization, opens a reverse shell or downloads a second-stage malware payload.

The malicious file is bundled with legitimate-looking model components and uploaded to the Hub, often under a name that typosquats a popular model. When an unsuspecting developer or MLOps pipeline downloads and loads the model with standard `transformers` functions such as `AutoModel.from_pretrained('malicious/model')`, the pickle deserialization performed in the backend (via `torch.load()`) triggers the remote code execution (RCE) payload. The attacker gains the privileges of the process that loaded the model, which may be running on a developer's workstation or a production inference server. This research highlighted the urgent need for the community to transition to safer serialization formats such as `safetensors`.
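To make the mechanism concrete, the following minimal sketch shows how unpickling invokes a callable chosen by whoever created the file. It is deliberately harmless: the hypothetical `Demo` class (an illustration, not taken from any real model) uses `__reduce__` to ask `pickle` to call the built-in `print`, where a real attack would name a dangerous callable instead.

```python
import pickle


class Demo:
    """Hypothetical stand-in for an attacker-controlled object."""

    def __reduce__(self):
        # pickle records "call print(...) to rebuild this object".
        # A malicious file would reference a dangerous callable here.
        return (print, ("side effect: this ran inside pickle.loads()",))


payload = pickle.dumps(Demo())

# Loading does not need the Demo class: the stream only references print.
result = pickle.loads(payload)

# The "restored" object is whatever the callable returned (None for print),
# not a Demo instance.
print(type(result).__name__)
```

The message is printed during `pickle.loads()` itself, before the caller has any chance to inspect the result. That is why validating an object after loading it offers no protection.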
Affected Systems
Testing Guide
1. **DO NOT USE MALICIOUS MODELS**. Loading a suspected malicious model cannot be tested safely outside a dedicated, isolated forensics environment.
2. **Static Analysis**: Before loading a model, inspect its repository on Hugging Face Hub. Look for suspicious files, especially custom Python files (`.py`) that are not part of a standard library.
3. **Use Safe Loading**: If the repository provides a `.safetensors` file, read its tensors with a safe utility such as `safetensors.safe_open()`, which inspects the weights without triggering pickle deserialization. If the repository contains only a `pytorch_model.bin` and no `.safetensors` file, treat it with extreme caution.
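Static analysis can also extend to the pickle stream itself. The sketch below uses the standard-library `pickletools` module to list the callables a pickle would import, without ever calling `pickle.load()`. It is a heuristic under stated assumptions, not a complete scanner: the function names, the `SUSPICIOUS_MODULES` deny-list, and the handling of zip-based checkpoints (looking for members ending in `.pkl`) are all illustrative choices, and a determined attacker can obfuscate references in ways this does not catch.

```python
import io
import pickle
import pickletools
import zipfile

# Assumed, non-exhaustive deny-list, for illustration only.
SUSPICIOUS_MODULES = {"os", "posix", "nt", "subprocess", "socket", "sys",
                      "runpy", "builtins", "__builtin__"}
_STRING_OPS = {"UNICODE", "BINUNICODE", "SHORT_BINUNICODE", "BINUNICODE8"}


def list_pickle_imports(data):
    """Return (module, name) pairs referenced by the pickle(s) in `data`.

    Only disassembles opcodes; nothing is unpickled or executed.
    """
    imports = set()
    stream = io.BytesIO(data)
    # A file may hold several concatenated pickles, possibly followed by
    # raw bytes, so keep going until the data stops decoding as pickle.
    while stream.tell() < len(data):
        recent_strings = []
        try:
            for opcode, arg, _pos in pickletools.genops(stream):
                if opcode.name in ("GLOBAL", "INST"):
                    module, _, name = arg.partition(" ")
                    imports.add((module, name))
                elif opcode.name == "STACK_GLOBAL" and len(recent_strings) >= 2:
                    # Heuristic: module and name are usually the two most
                    # recently pushed strings.
                    imports.add((recent_strings[-2], recent_strings[-1]))
                elif opcode.name in _STRING_OPS:
                    recent_strings.append(arg)
        except ValueError:
            break  # trailing data is not a pickle stream
    return imports


def scan_weight_file(path):
    """Scan a weight file that is a raw pickle or a zip containing pickles."""
    if zipfile.is_zipfile(path):
        found = set()
        with zipfile.ZipFile(path) as archive:
            for member in archive.namelist():
                if member.endswith(".pkl"):
                    found |= list_pickle_imports(archive.read(member))
        return found
    with open(path, "rb") as handle:
        return list_pickle_imports(handle.read())


def suspicious(imports):
    """Keep only references whose top-level module is on the deny-list."""
    return {(m, n) for m, n in imports if m.split(".")[0] in SUSPICIOUS_MODULES}


class _Probe:
    """Harmless test object whose pickle references builtins.print."""

    def __reduce__(self):
        return (print, ("never executed by the scanner",))


# The blob is only disassembled below, never loaded.
blob = pickle.dumps(_Probe(), protocol=4)
print(sorted(suspicious(list_pickle_imports(blob))))
```

A clean result from a scan like this does not prove a file is safe; it only means no deny-listed reference was found. Any hit, on the other hand, is a strong reason not to load the file.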
Mitigation Steps
1. **Prefer `safetensors`**: Prioritize models that provide weights in the `.safetensors` format, and request that format explicitly when possible: `AutoModel.from_pretrained(..., use_safetensors=True)`.
2. **Heed Hub Scans**: Hugging Face Hub scans uploaded files for malware and unsafe pickle imports. Check the scan results shown for a repository's files and heed any warnings.
3. **Vet Model Sources**: Only use models from highly reputable organizations and developers on Hugging Face. Check the model card, download count, and community discussions before use.
4. **Sandbox Execution**: Always load and test new, untrusted models in a sandboxed, network-restricted environment first, and observe their behavior before promoting them to development or production.
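The first step can be partly automated as a pre-download gate. The sketch below classifies a repository by its file names alone, assuming the caller has already obtained the file list (for example from `huggingface_hub.list_repo_files`, which is not called here). The function name, the verdict labels, and the suffix list are hypothetical choices for illustration, and a suffix such as `.bin` is only a hint that a file may be pickled.

```python
# Assumed suffixes that commonly hold pickled weights; not exhaustive.
PICKLE_SUFFIXES = (".bin", ".pt", ".pth", ".pkl", ".ckpt")
SAFE_SUFFIX = ".safetensors"


def classify_repo_files(filenames):
    """Summarize which weight formats a repository offers."""
    safe = sorted(f for f in filenames if f.endswith(SAFE_SUFFIX))
    pickled = sorted(f for f in filenames if f.endswith(PICKLE_SUFFIXES))
    if safe and not pickled:
        verdict = "safetensors-only"
    elif safe:
        verdict = "mixed"          # load with use_safetensors=True
    elif pickled:
        verdict = "pickle-only"    # treat with extreme caution
    else:
        verdict = "no-weights-found"
    return {"verdict": verdict, "safetensors": safe, "pickle": pickled}


report = classify_repo_files(["config.json", "pytorch_model.bin"])
print(report["verdict"])
```

A pipeline could refuse to proceed on a `pickle-only` verdict, or route such repositories to the sandboxed environment described in step 4.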
Patch Details
This is an ecosystem-wide risk, so no single patch resolves it. Hugging Face has taken steps to mitigate it by promoting `safetensors` as the default format and implementing repository scanning, but the risk from legacy `pickle` files remains.