Remote Code Execution via Insecure Deserialization in LangChain `load_chain`
Overview
A critical remote code execution (RCE) vulnerability was discovered in the LangChain framework, affecting versions prior to 0.3.5. The vulnerability resides in the `load_chain` function, which loads saved agent or chain configurations from local files. The root cause is the unsafe use of Python's `pickle` module for deserialization: although the function is intended to load JSON or YAML files, it deserializes a file with a `.pkl` extension when the configuration references one.

An attacker can craft a malicious pickle file containing an arbitrary Python payload, together with a benign-looking `.yaml` configuration file that points to it. When a victim application calls `load_chain` on this configuration, the pickle file is deserialized and the embedded code runs with the full permissions of the Python process hosting the LangChain application.

The attack is most dangerous where applications let users upload or specify chain configurations, and in supply chain attacks where a malicious configuration is bundled with a third-party project. The discovery, made by security researchers at Trail of Bits, highlights the recurring danger of insecure deserialization in complex application frameworks, including those at the forefront of AI development.
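To make the root cause concrete, the following is a minimal sketch of why unpickling untrusted data amounts to running untrusted code. It does not use LangChain at all. The class name `Payload` and the function `demo` are hypothetical, and the payload is deliberately harmless: it only creates a directory inside a temporary folder, where a real attacker would substitute a call such as `os.system`.

```python
import os
import pickle
import tempfile


class Payload:
    """Stand-in for an attacker-controlled object (hypothetical name)."""

    def __init__(self, marker_dir):
        self.marker_dir = marker_dir

    def __reduce__(self):
        # pickle records "call os.mkdir(marker_dir)" instead of object state.
        # A real attacker would return a dangerous callable here.
        return (os.mkdir, (self.marker_dir,))


def demo():
    """Return True if unpickling alone triggered the embedded call."""
    with tempfile.TemporaryDirectory() as tmp:
        marker = os.path.join(tmp, "marker")
        blob = pickle.dumps(Payload(marker))
        if os.path.exists(marker):
            return False  # serializing must not run the payload
        pickle.loads(blob)  # the recorded callable runs here, during loading
        return os.path.isdir(marker)


if __name__ == "__main__":
    print("code ran during unpickling:", demo())
```

The victim never calls a method on the loaded object. The call happens inside `pickle.loads` itself, which is why validating the object after loading offers no protection.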
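Affected Systems
LangChain versions prior to 0.3.5, in applications that call `load_chain` on configuration files from untrusted or third-party sources.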
Testing Guide
1. Check your project's dependencies for the LangChain version. If it is below `0.3.5`, you are vulnerable.
2. In an isolated test environment, create a test pickle file with a simple payload, for example one that writes a temporary file: define a class `RCE` whose `__reduce__` method returns `(os.system, ('touch /tmp/pwned',))`, then serialize an instance with `pickle.dump(RCE(), open('malicious.pkl', 'wb'))`.
3. Create a YAML file `config.yaml` that references the pickle file.
4. In a test script, use the vulnerable `load_chain` function to load `config.yaml`.
5. Check whether the file `/tmp/pwned` was created. If it exists, your application is vulnerable.
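Step 1 can be automated. The sketch below reads the installed version through the standard library's `importlib.metadata` and compares it with the fixed release named in this advisory. The helper names are hypothetical, and the parser is a simplification: it looks only at the leading numeric components, so a pre-release such as `0.3.5rc1` is treated like `0.3.5`.

```python
from importlib import metadata

PATCHED = (0, 3, 5)  # first fixed release according to this advisory


def parse_version(text):
    """Parse leading numeric components, e.g. '0.3.4' -> (0, 3, 4)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_affected(version_text, patched=PATCHED):
    """True if the given version sorts before the patched release."""
    return parse_version(version_text) < patched


def installed_langchain_version():
    """Return the installed langchain version string, or None if absent."""
    try:
        return metadata.version("langchain")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_langchain_version()
    if found is None:
        print("langchain is not installed in this environment")
    else:
        print(found, "affected" if is_affected(found) else "patched")
```

Run the check inside the same virtual environment as the application, since `importlib.metadata` reports only what is installed for the current interpreter.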
Mitigation Steps
1. **Upgrade LangChain:** Immediately update to version `0.3.5` or later. The patch removes the unsafe `pickle.load()` path in favor of pure JSON/YAML parsing.
2. **Avoid Loading Untrusted Files:** Never use `load_chain` or similar functions to load configurations from untrusted or user-provided sources. Treat all serialized artifacts as untrusted code.
3. **Use Safe Formats:** When saving and loading chain configurations, explicitly use data-only serialization formats such as JSON, and make sure the loading mechanism has no code execution path.
4. **Code Scanning:** Use static application security testing (SAST) tools to scan your codebase for unsafe deserialization patterns such as `pickle.load()`.
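As a rough illustration of the code-scanning step, the sketch below uses Python's `ast` module to flag direct `pickle.load` and `pickle.loads` calls. It is not a substitute for a real SAST tool: the function name and the `SAMPLE` source are hypothetical, and the check misses aliased imports such as `from pickle import load` or `import pickle as p`.

```python
import ast

# Calls flagged by this sketch; extend the set for other loaders you use.
UNSAFE_CALLS = {("pickle", "load"), ("pickle", "loads")}


def find_unsafe_calls(source, filename="<string>"):
    """Return sorted (line, 'module.func') pairs for direct pickle loads."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and (func.value.id, func.attr) in UNSAFE_CALLS
        ):
            findings.append((node.lineno, f"{func.value.id}.{func.attr}"))
    return sorted(findings)


SAMPLE = '''import pickle
import json

def load(path):
    with open(path, "rb") as fh:
        return pickle.load(fh)

def load_json(text):
    return json.loads(text)
'''

if __name__ == "__main__":
    for line, call in find_unsafe_calls(SAMPLE):
        print(f"line {line}: {call}")
```

In practice the same function could be run over every `.py` file in a repository as a cheap pre-commit check, with a dedicated scanner covering the cases this sketch misses.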
Patch Details
Patched in LangChain version 0.3.5. The patch replaces unsafe pickle loading with a pure JSON/YAML parsing approach.
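The general idea of a data-only loader can be sketched as follows. This is not the actual LangChain patch, whose code is not reproduced here. The function names, the `component_path` key, and the JSON-only allowlist are all assumptions made for illustration. The point is that every file reached through a configuration reference goes through the same extension check and the same data-only parser.

```python
import json
import os
import tempfile

ALLOWED_EXTENSIONS = {".json"}  # assumption: only JSON configs are accepted


def load_config(path):
    """Parse a configuration file as data only; refuse other file types."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"refusing to load {ext or 'extensionless'} file: {path}")
    with open(path, "r", encoding="utf-8") as fh:
        config = json.load(fh)
    if not isinstance(config, dict):
        raise ValueError("configuration must be a JSON object")
    return config


def load_referenced(config, key, base_dir):
    """Load a file referenced from a config through the same allowlist."""
    return load_config(os.path.join(base_dir, config[key]))


def demo():
    """Load a JSON reference, and confirm a .pkl reference is rejected."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "component.json"), "w", encoding="utf-8") as fh:
            json.dump({"kind": "example"}, fh)
        with open(os.path.join(tmp, "chain.json"), "w", encoding="utf-8") as fh:
            json.dump({"component_path": "component.json", "bad_path": "model.pkl"}, fh)
        config = load_config(os.path.join(tmp, "chain.json"))
        loaded = load_referenced(config, "component_path", tmp)
        try:
            load_referenced(config, "bad_path", tmp)
            rejected = False
        except ValueError:
            rejected = True
        return loaded, rejected


if __name__ == "__main__":
    print(demo())
```

The sketch does not guard against path traversal in referenced paths. A production loader would also need to confine references to an expected directory.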