Arbitrary Code Execution via Malicious PyTorch Model on Hugging Face Hub
Overview
Security researchers at a prominent university demonstrated a critical supply chain attack vector affecting the open-source AI community. The attack exploits the inherent insecurity of the `pickle` format, which is still commonly used to serialize and share PyTorch models (`pytorch_model.bin`). The researchers crafted a malicious model file in which one layer was replaced with a custom class that defines a `__reduce__` method. During deserialization, `pickle` calls whatever callable `__reduce__` specified, with attacker-chosen arguments, so loading the file can execute arbitrary system commands.

They uploaded this trojanized model to the Hugging Face Hub, presenting it as a new, high-performance variant of a popular language model. An unsuspecting developer calling the standard `AutoModel.from_pretrained('malicious/model')` from the Hugging Face Transformers library would download and load it. The internal call to `torch.load()` would then trigger the malicious `pickle` payload, resulting in arbitrary code execution on the user's machine or CI/CD runner. An attacker could use this to steal credentials (e.g., `AWS_SECRET_ACCESS_KEY`, `HF_TOKEN`), install malware, or poison training data.

The demonstration was a stark reminder of the risks in the ML supply chain and of the urgent need for universal adoption of safe serialization formats such as `safetensors`, which stores only tensor data and metadata and therefore cannot carry executable code.
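To make the mechanism concrete, the following is a minimal, harmless sketch of how a `__reduce__` method causes code to run at load time. The class name `FakeLayer` and the `eval("6 * 7")` payload are illustrative assumptions, not the researchers' actual exploit; a real attack would substitute something like `os.system` with a shell command.

```python
import pickle


class FakeLayer:
    """Hypothetical stand-in for a model layer, for illustration only."""

    def __reduce__(self):
        # Instead of this object's state, pickle records the instruction
        # "call eval with the argument '6 * 7'". The payload here is benign;
        # an attacker would choose a callable that runs system commands.
        return (eval, ("6 * 7",))


# The attacker serializes the object and ships the bytes to the victim.
payload = pickle.dumps(FakeLayer())

# The victim only has to load the bytes. The recorded callable runs during
# loading, and its return value takes the place of the original object.
result = pickle.loads(payload)
print(type(result).__name__, result)  # int 42
```

Note that the victim does not need the `FakeLayer` class definition at all: the serialized stream references only `eval` and its argument, which is why simply loading an untrusted file is enough to be compromised.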
Affected Systems
Testing Guide
1. In your Python script, load a model from the Hugging Face Hub using `AutoModel.from_pretrained(...)`.
2. Check the logs or library output to see which weights file is downloaded and loaded. If it is `pytorch_model.bin` rather than a `.safetensors` file, your workflow is potentially vulnerable.
3. Download (without loading) a model known to use the pickle format and inspect it with a tool like `picklescan` to see whether it reports any suspicious imports or opcodes.
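The idea behind step 3 can be approximated with the standard library alone. The sketch below walks the opcodes of a pickle stream with `pickletools.genops`, which decodes the stream without executing it, and collects the globals the stream would import. The function name `list_pickle_imports` is hypothetical, and the handling of `STACK_GLOBAL` (taking the last two string operands seen) is a simplifying heuristic, so treat this as a teaching aid rather than a replacement for a dedicated scanner.

```python
import pickle
import pickletools


def list_pickle_imports(data: bytes) -> set[str]:
    """Return the 'module.name' globals a pickle stream would import.

    Hypothetical helper: it only decodes opcodes and never unpickles.
    """
    imports = set()
    strings = []  # string operands seen so far, consumed by STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            # Older protocols carry "module name" directly in the opcode.
            module, name = arg.split(" ", 1)
            imports.add(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+ pushes module and name as two strings first.
            # Heuristic: assume they are the last two strings seen.
            if len(strings) >= 2:
                imports.add(f"{strings[-2]}.{strings[-1]}")
        elif isinstance(arg, str):
            strings.append(arg)
    return imports


class Demo:
    """Benign illustrative payload; never unpickled in this example."""

    def __reduce__(self):
        return (eval, ("6 * 7",))


found = list_pickle_imports(pickle.dumps(Demo()))
benign = list_pickle_imports(pickle.dumps({"a": [1, 2]}))
print(found, benign)
```

Any import of `builtins.eval`, `os.system`, `subprocess` functions, or similar in a weights file is a strong sign of tampering. Recent `torch.save` output is typically a zip archive whose pickle stream lives in a member ending in `data.pkl`; in that case the member would first be read with `zipfile` and its bytes passed to the function.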
Mitigation Steps
1. **Prioritize `safetensors`:** Always use and prioritize models that are available in the `.safetensors` format. The `from_pretrained` method in Transformers will automatically prefer it if available.
2. **Explicitly Load Safe Tensors:** When loading models, enforce the use of safetensors by passing `use_safetensors=True` to the `from_pretrained` method, so that loading fails instead of falling back to a pickle file.
3. **Scan Models Before Use:** Use the Hugging Face Hub's built-in pickle scanning results or a scanner such as `picklescan` to check for malicious code and unsafe `pickle` imports before loading a model.
4. **Isolate Loading Environments:** Load and convert new models from untrusted sources in a sandboxed, network-isolated environment to contain any potential code execution.
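Steps 1 and 2 can be combined into a small guard, sketched below under a few assumptions. The helper names `has_safetensors_weights` and `load_safely` are hypothetical; `list_repo_files` from `huggingface_hub` and the `use_safetensors` argument of `from_pretrained` are existing APIs, but their behavior may vary across library versions, so check it against the versions you have installed.

```python
def has_safetensors_weights(filenames):
    """Return True if any file in the repo listing is a safetensors file."""
    return any(name.endswith(".safetensors") for name in filenames)


def load_safely(repo_id):
    """Load a model only if safetensors weights exist (illustrative helper)."""
    # Imported lazily so the check above can be used without these packages.
    from huggingface_hub import list_repo_files
    from transformers import AutoModel

    if not has_safetensors_weights(list_repo_files(repo_id)):
        raise RuntimeError(
            f"{repo_id} has no .safetensors weights; refusing to load pickle files"
        )
    # use_safetensors=True prevents a silent fallback to pytorch_model.bin.
    return AutoModel.from_pretrained(repo_id, use_safetensors=True)


# The filename check also matches sharded checkpoints.
print(has_safetensors_weights(["config.json", "model-00001-of-00002.safetensors"]))
print(has_safetensors_weights(["config.json", "pytorch_model.bin"]))
```

Failing closed is the point of this design: a repository that offers only pickle weights is rejected before anything is deserialized, rather than being loaded with a warning.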
Patch Details
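No patch is available because the vulnerability lies in the `pickle` format itself: executing callables during deserialization is part of how the format works. Mitigation therefore means switching to safer formats and practices, as described under Mitigation Steps.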