Trojan-Triggered RCE via Maliciously Crafted Model on Hugging Face Hub
Overview
A supply chain attack demonstrated that popular open-source models on Hugging Face Hub could be trojanized to achieve remote code execution (RCE). Attackers took a legitimate, widely used model, fine-tuned it with a backdoor, and re-uploaded it under a similar name or through a compromised account. The backdoor relied on Python's `pickle` serialization format, which PyTorch's `torch.load` function uses to read checkpoints (PyTorch 2.6 and later default to `weights_only=True`, which restricts what the unpickler may execute; earlier versions do not). The attacker crafted a malicious `__reduce__` method within a custom class embedded in the model's checkpoint file (`.bin` or `.pth`). The code stayed dormant until the checkpoint was deserialized, at which point the unpickler invoked it automatically. The most advanced variants added a trigger mechanism: rather than acting at load time, the RCE payload executed only when the model received a specific, seemingly benign input string during inference (e.g., 'Run system diagnostics'). Once triggered, the payload ran with the permissions of the user running the model, allowing it to steal API keys, exfiltrate data, or install persistent malware. This research exposed the significant risks of downloading and loading unaudited models from public repositories without proper sandboxing.
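To make the `__reduce__` mechanism concrete, here is a minimal, deliberately harmless sketch using only the standard library. The class name `BenignDemo` and the choice of `sorted` as the callable are assumptions made for illustration; the point is only that `pickle.loads` calls whatever callable `__reduce__` names, before any model code runs.

```python
import pickle


class BenignDemo:
    """Hypothetical stand-in for a class embedded in a checkpoint."""

    def __reduce__(self):
        # pickle records this (callable, args) pair and calls it at load time.
        # A harmless builtin is used here; a malicious file would name a
        # callable that runs commands instead.
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(BenignDemo())

# Loading the bytes calls sorted([3, 1, 2]); no BenignDemo instance is rebuilt.
result = pickle.loads(payload)
print(result)
```

Note that the loader never needs the `BenignDemo` class definition: the serialized bytes reference only the callable and its arguments, which is why inspecting the Python source that ships with a model is not enough to rule out this kind of payload.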
Affected Systems
Testing Guide
1. **Audit Model Loading Code:** Review your codebase for `torch.load()` and `from_pretrained()` calls, and note where each loaded model comes from.
2. **Check for `trust_remote_code=True`:** Search your projects for the string `trust_remote_code=True`. If found, verify that the model being loaded comes from a source you trust and have reviewed.
3. **Scan Model Files:** Download the model files (`.bin`, `.pth`) without loading them and run a scanner such as `picklescan` on them: `picklescan --path /path/to/your/model/`.
4. **Check File Format:** Prefer models with `.safetensors` weights over `pytorch_model.bin` files.
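The idea behind step 3 can be sketched with the standard library's `pickletools`, which walks a pickle stream's opcodes without executing them. This is a simplified illustration, not how `picklescan` is implemented: the `SUSPICIOUS_MODULES` denylist and both function names are hypothetical, the sketch does not follow memo lookups, and it expects raw pickle bytes (recent `torch.save` checkpoints are zip archives, so the inner pickle member would need to be extracted first).

```python
import os
import pickle
import pickletools

# Hypothetical, deliberately short denylist for illustration only.
SUSPICIOUS_MODULES = {"os", "posix", "nt", "subprocess", "socket", "sys"}


def list_imports(data):
    """Return (module, name) pairs a pickle stream would import, without loading it."""
    imports = []
    strings = []  # string constants seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Protocols 0-3: the argument is "module name" in one string.
            module, name = arg.split(" ", 1)
            imports.append((module, name))
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+: module and name are the two preceding string constants.
            if len(strings) >= 2:
                imports.append((strings[-2], strings[-1]))
        elif isinstance(arg, str):
            strings.append(arg)
    return imports


def suspicious_imports(data):
    """Filter the imports down to those whose top-level module is denylisted."""
    return [(m, n) for m, n in list_imports(data)
            if m.split(".")[0] in SUSPICIOUS_MODULES]


class Flagged:
    def __reduce__(self):
        # Harmless callable from a denylisted module; it is never invoked here
        # because the bytes are only inspected, not loaded.
        return (os.getcwd, ())


benign = pickle.dumps({"weights": [1.0, 2.0]})
flagged_v4 = pickle.dumps(Flagged(), protocol=4)
flagged_v3 = pickle.dumps(Flagged(), protocol=3)

print(suspicious_imports(benign))
print(suspicious_imports(flagged_v4))
```

A denylist like this only catches known-bad imports, so an empty result is not proof that a file is safe; it is a reason to combine scanning with the other steps rather than rely on it alone.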
Mitigation Steps
1. **Use SafeTensors:** Whenever possible, use models distributed in the `safetensors` format, which stores only tensor data and does not allow arbitrary code execution on load.
2. **Disable Remote Code Execution:** When loading models with Hugging Face Transformers, keep `trust_remote_code=False` (the default) and set it explicitly so the choice is visible in code review. Only set it to `True` for models from highly trusted sources.
3. **Sandbox Inference:** Run model loading and inference in a sandboxed, network-isolated environment (e.g., a minimal Docker container with no secrets mounted) to limit the blast radius of a potential compromise.
4. **Scan Models:** Use model scanning tools such as `picklescan` to inspect model files for potentially malicious code before loading them.
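As a small aid for the first step, the following sketch reports which weight files in a downloaded model directory are `safetensors` and which use pickle-based formats. The function name `classify_weight_files` and the `PICKLE_EXTENSIONS` set are assumptions made for this example, and the check looks only at file extensions, so it cannot tell whether a file's contents match its name.

```python
import tempfile
from pathlib import Path

# Extensions commonly used for pickle-based checkpoints; illustrative, not exhaustive.
PICKLE_EXTENSIONS = {".bin", ".pth", ".pt", ".pkl", ".ckpt"}


def classify_weight_files(model_dir):
    """Group the files under model_dir by weight format, based on extension."""
    report = {"safetensors": [], "pickle_based": []}
    for path in sorted(Path(model_dir).rglob("*")):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix == ".safetensors":
            report["safetensors"].append(path.name)
        elif suffix in PICKLE_EXTENSIONS:
            report["pickle_based"].append(path.name)
    return report


# Demonstration on a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("model.safetensors", "pytorch_model.bin", "config.json"):
        (Path(tmp) / name).touch()
    report = classify_weight_files(tmp)

print(report)
```

Any entry under `pickle_based` is a candidate for scanning and sandboxed loading, or for replacement with a `safetensors` equivalent if the publisher offers one.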
Patch Details
This is an attack pattern rather than a specific software vulnerability, so there is no single patch to apply. Mitigation relies on user awareness and the safe loading practices described above.