Pluto Security — CVE-2026-4372 Hugging Face Transformers Config Injection RCE
AI relevance: The Hugging Face Transformers library is the default model-loading layer for virtually every Python-based AI pipeline — a poisoned model config that executes on from_pretrained() is a direct supply-chain attack vector against ML infrastructure, fine-tuning jobs, and RAG pipelines.
Key Details
- Tracked as CVE-2026-4372; discovered and disclosed by Pluto Security researchers.
- Mechanism: The
_attn_implementation_internalfield in a model'sconfig.jsonis treated as a kernel-dispatch directive. When set to an attacker-controlled Hub repository ID, the library silently executes arbitrary Python code during a standardfrom_pretrained()call. - Bypasses the
trust_remote_code=Falsesafeguard — users who explicitly disabled remote code execution still get compromised. - Affected versions: transformers 4.56.0 through 5.2.x when the optional
kernelspackage is installed. - Exposure window: approximately six months — from August 2025 (v4.56.0) until March 4, 2026 (v5.3.0 silent fix).
- Blast radius: 2.2B+ total PyPI installs, ~146M monthly downloads, 157K GitHub stars, and 1M+ models on HuggingFace Hub.
- Impact is equivalent to prior supply-chain attacks on model loading, but the friction is dramatically lower — no manual loader script needed, just the standard API call every ML engineer runs daily.
- Fix: upgrade to transformers ≥ 5.3.0 and audit any models loaded from untrusted sources during the exposure window for credential compromise.
Why It Matters
Transformers is the backbone of the open-source AI ecosystem. This vulnerability turns the most common model-loading function into a zero-click RCE primitive for anyone pulling models from the Hub. The fact that it silently bypasses the library's primary security boundary (trust_remote_code=False) means teams that followed best practices were still exposed. Anyone running fine-tuning jobs, embedding pipelines, or RAG systems on vulnerable versions should assume potential compromise if they loaded models from untrusted or newly-published repositories between August 2025 and March 2026.
What to Do
- Pin transformers to ≥ 5.3.0 in all AI/ML environments and CI pipelines.
- Audit AWS credentials, SSH keys, and cloud tokens on any machine that loaded Hub models during the six-month exposure window.
- Consider network-level egress filtering for ML workstations and GPU nodes — many credential exfiltration paths depend on outbound HTTPS.
- Treat model configs as untrusted input paths alongside MCP servers and agent skills.
Sources: