Leaked Hugging Face Access Tokens in Public CI/CD Logs Allow Model Supply Chain Attack
Overview
A recurring security incident pattern involves high-privilege Hugging Face access tokens leaking through public continuous integration (CI) and continuous deployment (CD) logs, primarily on GitHub. Developers and organizations often store a Hugging Face write token as a repository secret (`HF_TOKEN`) so that their CI/CD workflows can automatically push updated models, datasets, or Spaces to the Hub. If a job fails and the workflow or its scripts are configured to dump all environment variables for debugging, the `HF_TOKEN` value can end up in a publicly viewable log. Attackers actively scan public GitHub Actions logs for strings that match the token format.

Once an attacker obtains a write-access token, they control every Hugging Face repository that token can write to, whether it belongs to a user or an organization. They can then mount a supply chain attack: poisoning popular models with backdoored code (for example, by exploiting pickle deserialization when the model is loaded), replacing legitimate model weights with malicious ones, or deleting models and datasets outright. This is not a flaw in the Hugging Face platform itself. It is a critical misconfiguration in the MLOps tooling that connects development workflows to model registries.
Affected Systems
Public repositories whose CI/CD workflows (primarily GitHub Actions) hold a Hugging Face write token such as `HF_TOKEN` as a secret, together with the Hugging Face models, datasets, and Spaces that the token is allowed to modify.
Testing Guide
1. **Audit Public Logs**: Manually review the CI/CD logs of all your public repositories on GitHub or other platforms. Search for any instance of your Hugging Face token string or the variable name `HF_TOKEN`.
2. **Use Secret Scanning Tools**: Integrate automated secret scanning tools (e.g., GitGuardian, TruffleHog) into your CI pipeline to detect accidental commits or logs containing secrets before they become public.
3. **Review CI/CD Scripts**: Examine your `.github/workflows/` YAML files and any associated scripts. Look for commands such as `printenv` or `echo` that might expose sensitive environment variables when a job fails.
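Steps 1 and 3 can be partly automated. The following is a minimal sketch, not a replacement for a dedicated secret scanner. It assumes that Hugging Face user access tokens look like `hf_` followed by 30 or more alphanumeric characters, and the function names `find_token_lines` and `find_risky_commands` are hypothetical helpers introduced here for illustration.

```python
import re
import sys
from pathlib import Path

# Assumption: tokens look like "hf_" plus 30 or more alphanumeric characters.
# Adjust the pattern if your tokens have a different shape.
HF_TOKEN_PATTERN = re.compile(r"\bhf_[A-Za-z0-9]{30,}\b")

# Heuristics for commands that tend to expose secrets in CI logs.
RISKY_COMMAND_PATTERN = re.compile(
    r"\bprintenv\b"                    # dumps the whole environment
    r"|(?:^|run:)\s*env\s*$"           # bare `env` command, not the YAML `env:` key
    r"|\becho\b.*(?:\$\{?HF_TOKEN\b|secrets\.HF_TOKEN)"  # echoes the secret itself
)


def find_token_lines(text):
    """Return (line_number, redacted_line) for each line with a token-like string."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if HF_TOKEN_PATTERN.search(line):
            # Redact before reporting so the scanner does not re-leak the token.
            hits.append((number, HF_TOKEN_PATTERN.sub("hf_***", line)))
    return hits


def find_risky_commands(text):
    """Return (line_number, stripped_line) for each line that may print secrets."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if RISKY_COMMAND_PATTERN.search(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    # Usage: python scan.py downloaded_job.log .github/workflows/release.yml
    for arg in sys.argv[1:]:
        content = Path(arg).read_text(errors="replace")
        for number, line in find_token_lines(content):
            print(f"{arg}:{number}: possible token: {line}")
        for number, line in find_risky_commands(content):
            print(f"{arg}:{number}: risky command: {line}")
```

Both checks are heuristics: they can miss tokens that were encoded or split across lines, and they can flag harmless commands. Treat any token hit as a confirmed leak and rotate the token immediately.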
Mitigation Steps
1. **Use OIDC for Authentication**: The most effective mitigation is to stop using long-lived tokens. Where your CI/CD provider and your Hugging Face setup support it, configure the pipeline to authenticate through OpenID Connect (OIDC), which issues short-lived, automatically managed credentials.
2. **Mask Secrets**: Ensure your CI/CD service is configured to automatically mask secrets in logs. Manually review job configurations so that no script prints a secret explicitly (e.g., `echo $HF_TOKEN`) or dumps the whole environment.
3. **Use Fine-Grained Tokens**: If you must use tokens, create fine-grained tokens on Hugging Face with the minimum required scope (e.g., write access to the single repository the workflow publishes to, or read-only access where the job only downloads) instead of a general-purpose write token.
4. **Regularly Audit and Rotate**: Periodically rotate all access tokens and audit the Hugging Face account's access settings to remove stale or overly permissive tokens.
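For step 2, scripts that print diagnostic information on failure can redact secrets themselves instead of relying only on the CI service's masking. The sketch below shows one way to do this in a Python-based CI script. The helper name `redact_environment`, the list of sensitive name fragments, and the assumed `hf_` token format are illustrative assumptions, not part of any Hugging Face or GitHub API.

```python
import os
import re

# Assumption: variable names containing these fragments hold secrets.
SENSITIVE_NAME_PATTERN = re.compile(r"TOKEN|SECRET|PASSWORD|KEY", re.IGNORECASE)

# Assumption: tokens look like "hf_" plus 30 or more alphanumeric characters.
HF_TOKEN_PATTERN = re.compile(r"\bhf_[A-Za-z0-9]{30,}\b")


def redact_environment(environ):
    """Return a copy of an environment mapping that is safer to print in a log."""
    safe = {}
    for name, value in environ.items():
        if SENSITIVE_NAME_PATTERN.search(name):
            # Hide the whole value when the variable name looks sensitive.
            safe[name] = "***"
        else:
            # Otherwise still scrub token-like strings embedded in the value.
            safe[name] = HF_TOKEN_PATTERN.sub("hf_***", value)
    return safe


if __name__ == "__main__":
    # Print a redacted environment dump instead of calling `printenv`.
    for name, value in sorted(redact_environment(os.environ).items()):
        print(f"{name}={value}")
```

Redaction of this kind is defense in depth. It does not catch secrets that were encoded or transformed, so it complements rather than replaces narrowly scoped, short-lived credentials.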
Patch Details
This is a configuration and process issue, not a software vulnerability with a specific patch. Mitigation relies on adopting secure practices like OIDC.