Arbitrary Code Execution via Malicious Model Weights using PyTorch's torch.load
Overview
A persistent vulnerability in the AI/ML ecosystem stems from the use of Python's `pickle` format for serializing and sharing model weights. PyTorch's primary function for loading models, `torch.load()`, is built on `pickle`, and `pickle` is insecure by design: a pickle stream can instruct the loader to import and call arbitrary Python callables during deserialization. An attacker can therefore create a malicious model file (e.g., a `.pth` or `.pt` file) containing a crafted pickle payload.

When an unsuspecting developer or MLOps pipeline downloads this model from a public repository such as the Hugging Face Hub and loads it with an unrestricted `torch.load()` call, the payload runs on their machine with the privileges of the loading process. This gives the attacker a direct path to code execution inside the victim's environment. The impact is critical: compromised developer workstations, CI/CD runners, and production model inference servers, from which attackers can steal credentials, exfiltrate proprietary data and models, or deploy ransomware.

PyTorch documents this behavior as intended and dangerous, but the widespread practice of downloading pretrained models from untrusted sources makes it a highly practical attack vector. Model-sharing hubs amplify the risk, because users often implicitly trust models with high download counts, which attackers could inflate artificially. This vulnerability underscores the need for safer model serialization formats and stricter vetting processes on public model hubs.
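To make the mechanism concrete, the following minimal sketch uses only the standard-library `pickle` module, not PyTorch. The class name `Payload` is hypothetical, and the payload is deliberately benign (it evaluates an arithmetic expression). A real attack would substitute a callable such as `os.system`.

```python
import pickle


class Payload:
    """Stand-in for an attacker-controlled object (hypothetical name)."""

    def __reduce__(self):
        # pickle records this (callable, args) pair in the stream, and the
        # loader calls it during deserialization. The expression here is
        # harmless; an attacker would choose something like os.system.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Loading does not rebuild a Payload instance. It calls eval("6 * 7")
# and returns whatever that call returns.
result = pickle.loads(blob)
print(result)  # 42
```

The victim only has to call a load function on the bytes; no method of the loaded object needs to be invoked for the code to run.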
Affected Systems
Testing Guide
1. **Check Model Loading Code:** Audit your codebase for calls to `torch.load(file)`, paying particular attention to calls that do not pass `weights_only=True`.
2. **Review Model Provenance:** Identify the source of every `.pth`, `.pt`, or pickle file being loaded, and determine whether it comes from a verifiable, trusted source.
3. **Use a Scanning Tool:** Run a tool such as `picklescan` on your model weight files: `picklescan -p /path/to/your/model.pth`. A report of dangerous globals (imports such as `os.system` or `eval`) indicates a potential risk.
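The code audit in step 1 can be partly automated. The sketch below is one possible approach using the standard-library `ast` module; the function name `find_unsafe_torch_load` is hypothetical. It assumes the code calls the function literally as `torch.load(...)`, so it will miss aliased imports (for example `from torch import load`) and cases where `weights_only` is passed as a variable.

```python
import ast


def find_unsafe_torch_load(source: str, filename: str = "<string>") -> list[int]:
    """Return line numbers of torch.load(...) calls lacking weights_only=True."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match only the literal attribute access `torch.load`.
        is_torch_load = (
            isinstance(func, ast.Attribute)
            and func.attr == "load"
            and isinstance(func.value, ast.Name)
            and func.value.id == "torch"
        )
        if not is_torch_load:
            continue
        # A call counts as safe only if weights_only is the literal True.
        safe = any(
            kw.arg == "weights_only"
            and isinstance(kw.value, ast.Constant)
            and kw.value.value is True
            for kw in node.keywords
        )
        if not safe:
            hits.append(node.lineno)
    return sorted(hits)


sample = '''
import torch
a = torch.load("model.pth")
b = torch.load("model.pth", weights_only=True)
c = torch.load("model.pth", weights_only=False)
'''
print(find_unsafe_torch_load(sample))  # [3, 5]
```

In practice the function would be applied to the contents of each `.py` file in a repository; flagged lines still need manual review to decide whether the loaded file is trusted.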
Mitigation Steps
1. **Use Safe Formats:** Whenever possible, save and load models in the `safetensors` format (`.safetensors` files). This format is designed to be secure and does not allow arbitrary code execution.
2. **Scan Models:** Use `picklescan` or another model security scanner to analyze model files for malicious payloads before loading them.
3. **Load from Trusted Sources:** Only load models from verified creators and organizations on platforms such as the Hugging Face Hub.
4. **Restrict `torch.load`:** If you must load pickle-based files, call `torch.load` with `weights_only=True`. The parameter is available in PyTorch 1.13 and later and is the default from PyTorch 2.6. It restricts deserialization to tensors, primitive types, and an allowlist of known-safe types, which is much safer than unrestricted loading.
5. **Sandbox Execution:** Run model loading and inference in isolated, sandboxed environments to limit the impact of a potential compromise.
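The allowlist principle behind restricted loading (step 4) can be illustrated with the standard library alone. The sketch below follows the "restricting globals" pattern from the Python `pickle` documentation; it is not PyTorch's implementation of `weights_only`, and the names `ALLOWED_GLOBALS`, `AllowlistUnpickler`, `restricted_loads`, and `Payload` are hypothetical.

```python
import io
import pickle
from collections import OrderedDict

# Hypothetical allowlist: the only (module, name) pairs a stream may import.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # The unpickler calls find_class for every global a stream references.
        # Anything outside the allowlist is rejected before it can be called.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data: bytes):
    """Deserialize data, permitting only allowlisted globals."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


class Payload:
    """Benign stand-in for a malicious object."""

    def __reduce__(self):
        return (eval, ("6 * 7",))


# A plain container of primitive values loads normally.
safe_blob = pickle.dumps(OrderedDict(weight=[0.1, 0.2]))
print(restricted_loads(safe_blob))

# The payload is rejected because builtins.eval is not on the allowlist.
try:
    restricted_loads(pickle.dumps(Payload()))
except pickle.UnpicklingError as exc:
    print(exc)  # blocked global: builtins.eval
```

An allowlist fails closed: a type that is not explicitly permitted raises an error instead of executing, which is why this approach is preferable to trying to enumerate dangerous callables.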
Patch Details
No patch is available, because this is known, documented behavior of `pickle` and `torch.load` rather than an implementation bug. Mitigation relies on safer alternatives such as `safetensors` and on secure coding practices.